By Steve Endow
Uncle. I give up. I have lost the fight.
Email has won. I am defeated.
What was once a great tool for communication has become an overbearing hassle that has destroyed my productivity.
I receive around 50 to 75 emails every weekday. On a very bad day, I'll hit 100 emails. I've determined that 100 inbound emails a day is completely unmanageable for me. With my current processes (or lack thereof), I cannot possibly be productive with that many emails coming at me. The number of responses and tasks from 100 emails prevents me from doing any other work.
If all I did was "manage" my email all day, and do nothing else, I could probably wrangle my Inbox, but I wouldn't get any "real work" done. When I focus on doing real work and ignore my email for a day, my Inbox explodes.
It isn't just the emails themselves. It's also that many of the emails have some type of commitment attached to them.
"Hey Steve, please review this thread of 30 cryptic replies below and let me know what you think."
"Here's the 15 page document I created, please proofread it."
"When can you schedule a call?"
"We are getting an error. What is causing this?"
"Here are links to a forum post and KB article. Does this explain the error I'm getting?"
"How many hours will it take you to do X?"
"I sent you an email earlier? Did you get my email? Can you reply to my email?"
People seem to expect a relatively prompt reply to their emails--because they think their request is most important, naturally, and because I don't have any other work to do, right?
This week, a link to this article appeared in my Twitter feed:
One-Touch to Inbox Zero
By Tiago Forte of Forte Labs
I have heard of Inbox Zero previously, but I had dismissed it as a bit of a gimmick without fully understanding it.
This time, I actually read the article by Tiago Forte and his explanation finally clicked for me. His examples and analogies made sense, and his emphasis on email as the first step of a more comprehensive communication and productivity workflow helped me build a new interpretation of Inbox Zero.
My blog has moved! Please visit the new blog at: https://blog.steveendow.com/ I will no longer be posting to Dynamics GP Land, and all new posts will be at https://blog.steveendow.com Thanks!
Friday, December 29, 2017
Thursday, December 21, 2017
Accepting help from experts and offering help as an expert
By Steve Endow
I've recently had two situations where someone asked for help with Dynamics GP, and when I provided guidance, the requester indicated that my suggestions were not relevant. Without considering my suggestions or trying them, the requester immediately ruled them out.
They were simple suggestions, such as "please try making this change and perform the process again to see if that resolves the error", or "have you traced your source data to verify that it isn't the cause of the incorrect transaction that was imported?".
"That can't be the cause." was one response.
"My custom stored procedure that imports data into GP verifies everything, so I know it worked properly." was another response.
Another common response I receive when troubleshooting issues is, "We've already checked that and it's not the cause of the problem."
I don't consider myself an "expert" at anything, but there are some topics where I've done enough work to have a certain level of knowledge, intuition, and skills such that I'm generally able to narrow down the causes of problems, and typically know some good places to start looking for causes. I have had enough success solving problems in certain areas that it seems like my approach generally works.
When someone asks for help and then immediately dismisses my initial recommendations without even trying them, how can I help them? Maybe they don't know who I am or what experience I have, and they're skeptical of my suggestions. What can I do then?
Do I gently explain that I've worked with over 400 customers in this specific domain, and that my anecdotal statistics would not support the assertion that their integration is infallible or that Dynamics GP is at fault? Is it my job to convince them that I tend to have a fairly good grasp of the subject matter and that they should reconsider my suggestion? Is there any point in arguing with someone who has asked for help, but isn't accepting my help?
"Experts" don't know everything and can't always immediately pinpoint causes or solutions. But if they ask questions, ask for more information, or ask you to test something, isn't it in your best interest to at least try working with them? If you're not willing to work with an expert, what are your alternatives?
Instead of immediately ruling out suggestions, welcome them as opportunities to learn. Collect new data. Make new assessments. Understand what they are thinking.
Be inquisitive and curious and humble. Don't be defensive or righteous. This applies to the person asking for help, as well as the expert being asked.
Steve Endow is a Microsoft MVP in
Los Angeles. He is the owner of Precipio Services, which provides
Dynamics GP integrations, customizations, and automation solutions.
Wednesday, December 20, 2017
Building a Dynamics GP test environment on a B-series Azure Virtual Machine: Not so fast!
By Steve Endow
With the recent release of Dynamics GP 2018, I wanted to set up a new virtual machine that I could use for testing and development.
I currently run my own Hyper-V server, which serves up 20 different virtual machines, and has been very low cost and is extremely fast. I would be happy to outsource my VMs to the "cloud", but having looked into the cost several times over the last few years, it just isn't economical for me. I previously estimated it would cost me over $300 a month to host just a few VMs. That cost, on top of having to severely limit the number of VMs I can run, just didn't make sense for hosting my internal development VMs.
But recently fellow MVP Beat Bucher told me about a new Azure VM that was lower cost: the B-Series "burstable" VMs.
https://azure.microsoft.com/en-us/blog/introducing-b-series-our-new-burstable-vm-size/
Beat explained that he was able to run two of the B4ms machines continuously for a cost of roughly $150 per month. I was intrigued.
After reviewing the different sizes, I set up a new B2ms virtual machine on Azure, running Windows Server. The provisioning process was very simple, easy, and fast, and I had a VM a few minutes later.
I then downloaded and installed SQL Server and SQL Management Studio. There were a few subtle hints that something wasn't quite right, but at the time the machine seemed great.
I then downloaded the 1.6 GB Dynamics GP 2018 DVD as a zip file. Just as when I downloaded SQL Server, I noticed that the Chrome browser didn't show the download status while the GP 2018 zip file was downloading. When I opened Windows File Explorer, nothing showed up in the download directory during the download or after the download appeared to complete. It took quite a while for Windows File Explorer to show the downloaded file.
I noticed Windows File Explorer seemed unresponsive as well. It just didn't feel right, but I hadn't yet pieced together the clues.
I then tried to unzip the GP 2018 file. That's when it was clear something was wrong.
This status window appeared, showing that it would take over 30 minutes to extract the 1.6 GB zip file. What?? 1.36MB/s?
I then did dozens of other tests, simply copying large (1GB+) files on the C: drive and between the C: and D: temporary drive. The performance was abysmal.
After several tests, I noticed that on average, the file copies were clearly being throttled around 21-22MB/s.
What in the world was going on?
The B-Series VMs are supposed to have "Premium SSD" storage, and 21MB/s is definitely not SSD performance.
I submitted an Azure support case and after several days, received a response. The support rep admitted that because the B-Series VMs were relatively new, he didn't have much experience with them and would need me to do some tests to narrow down the cause. No problem.
He first had me "redeploy" the Azure VM, which apparently pushes the VM to a new "node" or physical host machine. I completed that process and tested again, but got the same results: file copies were still painfully slow.
He then had me install the Performance Insights plugin on the VM, which apparently runs some automated performance tests and automatically submits the results to the support case (a very cool feature). I completed that process and a few days later, he emailed me with an explanation for the slow disk performance I was seeing.
This is the critical information that I overlooked when selecting the B-Series VM:
Notice that the B2ms size has a maximum disk speed of 22.5 MB/s. That is the maximum.
The B4ms offers 35MB/s and the B8ms tops out at 50MB/s. 50 sounds a lot better than 22.5, but even 50MB/s is horrifically slow compared to any competent modern storage.
Even if you add an additional high performance Premium SSD, such as a 1023GB drive with 5,000 IOPS and 200MB/s throughput (which is VERY expensive), if it is attached to a B2ms VM, you will still be limited to 22.5 MB/s.
For comparison, my local Hyper-V server can copy files at 100MB/s from my NAS, and the limiting factor is the gigabit network connection between the NAS and the server, not my NAS or the SSDs in my server.
Local file copies on the SSDs on my Hyper-V server can be as high as 1GB/s!! It's so fast that I had a very hard time getting a screenshot while copying the 1.6GB Dynamics GP 2018 zip file.
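To put those throughput figures side by side, here is a minimal sketch of the arithmetic: the time to move the 1.6 GB zip file at each sustained rate mentioned above. It assumes 1.6 GB is treated as 1,600 MB and that the rate is held constant for the whole copy; the class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class CopyTimeEstimate
{
    // Seconds needed to move a file of the given size at a sustained throughput.
    public static double SecondsToCopy(double fileSizeMB, double throughputMBps)
    {
        if (throughputMBps <= 0)
            throw new ArgumentOutOfRangeException(nameof(throughputMBps));
        return fileSizeMB / throughputMBps;
    }

    public static void Main()
    {
        const double fileSizeMB = 1600; // assumption: 1.6 GB treated as 1,600 MB

        // Rates taken from the figures quoted in the post.
        var limits = new (string Label, double MBps)[]
        {
            ("B2ms cap", 22.5),
            ("B4ms cap", 35),
            ("B8ms cap", 50),
            ("Gigabit NAS copy", 100),
            ("Local SSD copy", 1000),
        };

        foreach (var limit in limits)
        {
            double seconds = SecondsToCopy(fileSizeMB, limit.MBps);
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "{0,-18} {1,7:F1} MB/s -> {2,6:F1} s", limit.Label, limit.MBps, seconds));
        }
    }
}
```

Under those assumptions, a single copy of the file takes a little over a minute at the B2ms cap and under two seconds on local SSDs, which is the gap that shows up on every large file operation during setup.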
If you are used to even half-decent disk performance on a server, can you live with 22.5 or 35 MB/s on an Azure B-Series VM?
And am I willing to spend an extra hour or two setting up an Azure B-Series VM, due to its brutally slow disk IO, for a Dynamics GP 2018 test environment? Am I confident that once I set it up and don't have to do many large file copies, the disk performance will be sufficient for my needs?
Can SQL Server actually run well enough on a disk throttled at 22.5MB/s? Now that I see the disk specs, I am pretty sure that the B-Series was never intended to run SQL Server.
And I'm not willing to waste my time to find out. Those disk speeds are so slow that I am not confident that the B-Series VM will meet my needs even for a test + development server. Even if I used the B4ms, that's roughly $75 a month for a potentially painfully slow VM.
So, I have ruled out the B-Series Azure VMs for now, and would have to look at the "standard" VMs, which would likely still cost $150-$300 per month for 1-2 non-production VMs.
Since I have a very fast Hyper-V server in my office that can easily host 20 VMs with a marginal cost of $0 per month per VM, it seems that I will be sticking with an on-premises server for at least a few more years.
Friday, November 10, 2017
Free Precipio SFTP file transfer and data export tool - New Version 1.41 released
By Steve Endow
I have released a new version of my free SFTP file transfer and data export tool for Dynamics GP.
The new version 1.41 can be downloaded from my web site:
http://precipioservices.com/sftp/
Version 1.41 includes the following enhancements:
- Add support for optional SQLTimeout setting in config file to increase SQL command timeout
- Set default SQLTimeout to 60 seconds if setting is not present in config file
- Increase SFTP Connection Timeout from 5 seconds to 30 seconds, and Idle Timeout from 10 seconds to 30 seconds
The SQL Timeout setting allows for longer running queries, or queries that result in larger export files.
The SFTP Connection Timeout was increased to accommodate some SFTP servers that might not complete the connection process in 5 seconds.
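As an illustration of the "optional setting with a default" behavior described in the release notes, here is a minimal sketch. It is not the tool's actual code: the config file format is not shown in this post, so the settings are represented as a plain dictionary, and the class and method names are hypothetical. Only the setting name SQLTimeout and the 60 second default come from the notes above; treating a non-numeric or non-positive value the same as a missing one is an extra assumption.

```csharp
using System;
using System.Collections.Generic;

public static class TimeoutSettings
{
    // Default from the release notes: 60 seconds when the setting is absent.
    public const int DefaultSqlTimeoutSeconds = 60;

    // Returns the configured SQLTimeout, or the default if the setting is
    // missing. Assumption: invalid or non-positive values also fall back.
    public static int GetSqlTimeout(IDictionary<string, string> settings)
    {
        string raw;
        int seconds;
        if (settings != null
            && settings.TryGetValue("SQLTimeout", out raw)
            && int.TryParse(raw, out seconds)
            && seconds > 0)
        {
            return seconds;
        }
        return DefaultSqlTimeoutSeconds;
    }

    public static void Main()
    {
        var withSetting = new Dictionary<string, string> { { "SQLTimeout", "300" } };
        var withoutSetting = new Dictionary<string, string>();

        Console.WriteLine(GetSqlTimeout(withSetting));    // 300
        Console.WriteLine(GetSqlTimeout(withoutSetting)); // 60
    }
}
```

In a .NET application, a value like this would typically be assigned to the command timeout of the SQL command that runs the export query, so that long-running queries are not cut off early.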
If you use the SFTP application, please let me know! I'd love to hear how you are using it and if it is working well for you.
Dynamics GP BlackLine Integration Upload
Dynamics GP Coupa Integration Upload
Dynamics GP IQ BackOffice Integration Upload
Dynamics GP SFTP Integration Upload
Dynamics GP SFTP File Transfer Upload
Wednesday, November 1, 2017
Beware of UTC time zone on dates when importing data into Dynamics GP!
By Steve Endow
Prior to this year, I rarely had to deal with time zones when developing integrations for Dynamics GP.
The customer was typically using GP in a US time zone, the SQL Server was on premises in that time zone, and all of their data usually related to that same time zone. Nice and simple.
Dynamics GP then introduced the DEX_ROW_TS field to several tables, and I would regularly forget that field used a UTC timestamp. That was relatively minor and easy to work around.
But with the increasing popularity of Software As A Service (SaaS) platforms, I'm seeing more and more data that includes UTC timestamps. I didn't think too much about this until today, when I found an issue with how a SaaS platform provided transaction dates in their export files.
Here is some sample data from a file that contains AP Invoices:
2017-09-05T14:26:05Z
This is a typical date time value, provided in what I generically call "Zulu time" format. Apparently this format is defined in ISO 8601.
The format includes date and time, separated by the letter T, with a Z at the end, indicating that the time is based on the UTC time zone.
So why do we care?
Until today, I didn't think much of it, as my C# .NET code converts the full date time string to a DateTime value based on the local time zone, something like this:
string docDate = header["invoice-date"].ToString().Trim();
DateTime invoiceDate;
// By default, TryParse converts a "Z" (UTC) value to the machine's local time zone
bool success = DateTime.TryParse(docDate, out invoiceDate);
if (!success)
{
    Log.Write("Failed to parse date for invoice " + docNumber + ": " + docDate, true);
}
This seemed to work fine.
But after a few weeks of using this integration, the customer noticed that a few invoices appeared to have the incorrect date. So an 8/1/2017 invoice would be dated 7/31/2017. Weird.
Looking at the data this morning, I noticed this in the SaaS data file for the Invoice Date field:
2017-08-25T06:00:00Z
2017-08-21T06:00:00Z
2017-08-23T06:00:00Z
Do you see the problem?
The SaaS vendor is taking the invoice date that the user in Colorado enters, and is simply appending "T06:00:00Z" to the end of all of the invoice dates.
Why is that a problem?
Well, when a user in Colorado enters an invoice dated 8/25/2017, they want the invoice date to be 8/25/2017 (UTC-7 time zone). When the SaaS vendor adds an arbitrary time stamp of 6am UTC time, my GP integration will dutifully convert that date into 8/24/2017 11pm Colorado time.
For invoices dated 8/25, that may not matter too much, but if the invoice is dated 9/1/2017, the date will get converted to 8/31/2017 and post to the wrong fiscal period.
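The shift described above can be reproduced with a minimal sketch. To keep the result independent of the machine it runs on, it uses a fixed UTC-7 offset as a stand-in for the local time zone rather than the system setting; the class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class InvoiceDateShiftDemo
{
    // Parse an ISO 8601 timestamp and return the calendar date as seen at a
    // fixed UTC offset (a stand-in for the machine's local time zone).
    public static DateTime DateAtOffset(string isoTimestamp, TimeSpan utcOffset)
    {
        DateTimeOffset parsed = DateTimeOffset.Parse(
            isoTimestamp, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal);
        return parsed.ToOffset(utcOffset).Date;
    }

    public static void Main()
    {
        TimeSpan utcMinus7 = TimeSpan.FromHours(-7); // assumption: fixed UTC-7

        foreach (string value in new[] { "2017-08-25T06:00:00Z", "2017-09-01T06:00:00Z" })
        {
            DateTime localDate = DateAtOffset(value, utcMinus7);
            Console.WriteLine(value + " -> "
                + localDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
        // 2017-08-25T06:00:00Z -> 2017-08-24
        // 2017-09-01T06:00:00Z -> 2017-08-31
    }
}
```

Any appended UTC time earlier than 07:00 lands on the previous calendar day at UTC-7, which is why every one of these invoice dates moves back by one day.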
To make things even more fun, I found that the SaaS vendor is also storing other dates in local time.
2017-09-05T08:24:36-07:00
2017-09-05T08:26:22-07:00
2017-09-05T08:28:13-07:00
So I have to be careful about which dates to convert from UTC to local time, which dates need the time truncated so that only the date remains, and which dates are already in local time. In theory, the .NET date parsing should handle the conversion properly, assuming the time zone is correct, but I now know that I have to keep an eye on the vendor data.
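One possible way to keep those cases apart, as a minimal sketch rather than the integration's actual code: parse with DateTimeOffset so the offset in the file is preserved, take only the calendar date as written for date-only fields, and keep the full instant for true timestamps. The class and method names are hypothetical, and the sketch assumes that the date portion of an invoice date is exactly what the user entered.

```csharp
using System;
using System.Globalization;

public static class VendorDateParser
{
    // Date-only fields (e.g. an invoice date with an arbitrary time appended):
    // keep the calendar date exactly as written, with no time zone conversion.
    public static bool TryParseDateOnly(string value, out DateTime date)
    {
        DateTimeOffset parsed;
        bool ok = DateTimeOffset.TryParse(value, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal, out parsed);
        // DateTimeOffset.Date is the date at the offset stated in the string.
        date = ok ? parsed.Date : default(DateTime);
        return ok;
    }

    // True timestamps: keep the instant and its stated offset together.
    public static bool TryParseTimestamp(string value, out DateTimeOffset timestamp)
    {
        return DateTimeOffset.TryParse(value, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal, out timestamp);
    }

    public static void Main()
    {
        DateTime invoiceDate;
        if (TryParseDateOnly("2017-09-01T06:00:00Z", out invoiceDate))
            Console.WriteLine(invoiceDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        // 2017-09-01

        DateTimeOffset created;
        if (TryParseTimestamp("2017-09-05T08:24:36-07:00", out created))
            Console.WriteLine(created.UtcDateTime.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture));
        // 2017-09-05 15:24:36
    }
}
```

The design choice here is to decide per field, not per file, whether a value is a calendar date or a point in time, since the same export mixes both.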
I will be contacting the vendor to have them fix the issue with the invoice dates--there is no good reason why they should be appending "T06:00:00Z" to dates.
Expect to see a lot more of this date format and related date issues as more customers adopt cloud-based solutions and services.
Tuesday, October 24, 2017
Why I don't accept foreign checks (aka North American banking is a mess)
By Steve Endow
Several years ago, I received a paper check, in the mail, from a Dynamics partner in Canada. The partner was paying my US dollar invoice, and thought they were doing me a favor by drafting the check from their US dollar bank account at their Canadian bank.
Send a check in US dollars to pay a USD invoice--makes sense, right?
Nosiree.
I attempted to deposit the check at my local Bank of America branch using the ATM. The ATM would not accept the check. So I went inside the bank, stood in line, and then told the teller I wanted to deposit the check. The teller looked at the check, and confusion ensued.
Eventually a manager came over and explained to me, with full confidence, and in no uncertain terms, that they were unable to accept the check. He explained that the problem was not that the check was from a Canadian bank. He said that the problem was that the Canadian check was issued in US Dollars. He claimed that because the country of origin did not match the check currency, the branch could not accept the check. That's the policy. (no it isn't)
So...how can I deposit the check?
The manager handed me a special envelope and a triplicate carbon copy form. He said I needed to fill out the form and mail it with the check to a super special obscure department at Bank of America called "Foreign Clean Collections"--whatever that means. Once the check is received by that department, it will review the check and coordinate with the foreign bank to get the funds transferred. This process will take 6-8 WEEKS.
You're kidding me, right? Nope.
So, being curious about this banking train wreck, I gave it a try. I filled out the form and mailed the USD $1,000 check off to the super special department.
A few weeks later, a deposit showed up in my account for $800. Yup, $200 less than the check. In addition to having to wait several weeks for the deposit, I was charged $200 in bank fees!
After that nightmare, I stopped accepting any foreign checks. I put a big red note on my invoice that says that I only accept credit cards and wire transfers from non-US customers. And guess what: That process has been working just fine for years.
This week, a Canadian partner didn't read my invoice, and didn't read my email with the invoice, and they mailed me a paper check. The check is from their Canadian bank, issued in US Dollars. Great.
So I contacted a colleague who regularly receives Canadian checks, and she said that she routinely deposits Canadian checks issued in USD at her local BofA branch without any issues. Huh.
But having paid my $200 entrance fee to the Bank of America Foreign Clean Collections Club, I wasn't about to just deposit this new check, wait several weeks, and see how much I get charged.
So I did the obvious thing: I called my local Bank of America branch.
First customer service rep: "Sorry, I don't deal with those things. Let me transfer you to our back office." Apparently the back office doesn't have voicemail and is out to lunch at 9am, as the phone rang for 3 minutes with no answer. I tried calling the branch back, but this time nobody answered and I got a voice response system. So the local bank branches are useless when inquiring about these things.
So I then called the main BofA customer service 800 number. I spoke with someone who tried very hard to help, but she was unable to find any information and her computer and phone were unable to contact the department who might be able to help. So she gave me the phone number to the Bank of America Foreign Exchange Call Center.
I then directly called the illustrious Foreign Exchange Call Center and spoke with someone who, for the first time, sounded like he understood the mysterious process of depositing foreign checks with Bank of America.
"Can I deposit this Canadian check drafted in US Dollars at my local California branch?" I asked.
"Every check is reviewed on a case by case basis," he replied.
What? What does that even mean?
"Every check is reviewed on a case by case basis," he replied.
So you have no consistent policy about depositing foreign checks?
"Yes, we have a very consistent policy that I just explained to you. Every check is reviewed on a case by case basis," he replied.
After speaking with him for several minutes and apparently annoying him, here is my understanding of the official Bank of America policy / procedure for foreign checks.
1. Acceptance of a foreign check is completely up to the discretion of the BofA branch, and the inconsistent and incorrect training that a teller or branch manager may have received. The branch can simply say they don't accept foreign checks. Or they can conjure up an excuse as to why they can't accept the check, like "the country of origin does not match the check currency".
2. If the branch is willing to try to accept the check, they can scan the check in their "system". This "system" then determines if Bank of America is willing to accept the check at that branch. Apparently this involves super secret algorithms about my "relationship" with the bank, the physical check, the bank that issued the check, the country of origin, the currency, the amount, etc.
3. If the "system" determines that the branch can accept the specific check, apparently the check will be deposited in a fairly normal manner.
4. If the "system" determines that the branch cannot accept the check, then the magical process with the Foreign Clean Collections department kicks in, and you get the multi-part form, special envelope, a 6-8 WEEK processing time, and hundreds of dollars in fees that you will not be able to determine in advance.
5. The representative claimed that Bank of America only charges a flat $40 for the Foreign Clean Collections process, but that the issuing bank can charge their own fees for having to process the foreign check. In my case, I was charged around USD $150 by the issuing Canadian bank just for the honor of cashing their USD check. There is realistically no way for you to know how much the foreign bank will charge in advance.
6. I asked the representative how I was supposed to accept payments given the uncertainty and fees involved in this process. He told me that they recommend wire transfers for foreign payments, and basically told me not to accept foreign checks.
What a shocking conclusion.
Naturally, I have received several responses from people saying that they accept foreign checks all the time at their bank and never have an issue. Good for you, I say, enjoy the 1900s! The Pony Express loves you!
I rarely receive such checks, don't want to have to drive to the bank to deposit them, and don't want to deal with clueless bank employees and the nightmare game-of-chance process outlined above.
Checks are a vestigial organ of banking and are a testament to the absurdly anachronistic North American banking system. Talk to someone from any country with a modern banking system and ask them how many checks they issue. "Checks? What?" will be the response. People from Singapore and Australia literally laugh in disbelief when I mention that the US still uses paper checks.
Wire transfers have been well established since the late 1800s and now provide same day international funds transfers, usually for a reasonable fixed fee. Credit cards are a de facto payment method for a massive volume of transactions in many countries, and have benefits like fraud protection and points, and the merchant pays the fees for those transactions--which I am happy to do.
And services like the excellent TransferWise provide very low cost EFT funds transfers to dozens of countries with an excellent exchange rate.
The only explanation I have for why North American consumers and businesses seem to cling to checks is that our backwards banking system does not (yet) charge fees to shuffle around millions of pieces of paper with ink on them, pay the postage to mail them, scan those papers into digital images, and then perform an electronic funds transfer behind the scenes. But banks do charge a fee if customers initiate a payment electronically through EFT / ACH or a wire transfer and no paper is involved. It's crazy.
So, after wasting a few more hours researching this topic, I now have a clear decree, straight from the heart of Bank of America, and will continue to accept only credit card and wire transfer payments from non-US customers. If it's good enough for the rest of the world, it's good enough for me.
Steve Endow is a Microsoft MVP in Los Angeles. He is the owner of Precipio Services, which provides Dynamics GP integrations, customizations, and automation solutions.
Thursday, September 28, 2017
Back up your Dynamics GP SQL Server databases directly to Azure Storage in minutes!
My blog has moved! Please visit the new blog at: https://blog.steveendow.com/
I will no longer be posting to Dynamics GP Land, and all new posts will be at https://blog.steveendow.com
Thanks!
By Steve Endow
At the GP Tech Conference 2017 in lovely Fargo, ND, Windi Epperson from Advanced Integrators had a great session about Disaster Recovery. One topic she discussed was the ability to use the Dynamics GP Back Up Company feature to save SQL backups directly to Azure.
I think doing SQL backups to Azure is a great idea. There are countless tales of SQL backups not being done properly or being lost or not being retained, and having an option to send an occasional SQL backup to Azure is great.
But this option is a manual process from the Dynamics GP client application, it is not scheduled, and it does not use the "Copy-only backup" option, so the backups will be part of the SQL backup chain if the customer also has a scheduled SQL backup job. So as Windi explained, it may be a great option for very small customers who can reliably complete the task manually on a regular basis.
But how about setting up a backup job in SQL Server that will occasionally send a backup to Azure?
It turns out that the process is remarkably easy and takes just a few minutes to set up and run your first backup to Azure Storage.
NOTE: From what I can tell, SQL backups to Azure are supported in SQL 2012 SP1 CU2 or later. And it appears that the backup command syntax may be slightly different for SQL 2012 and 2014, versus a newer syntax for SQL 2016.
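To make the syntax difference concrete, here is a rough Python sketch that only builds the T-SQL text for a copy-only backup to Azure Storage. It assumes the commonly documented pattern: on SQL 2012 / 2014 the command names a SQL credential with WITH CREDENTIAL, while on SQL 2016 the credential is typically tied to the container URL via a Shared Access Signature and the clause is left out. The function name, storage account, container, and credential names are all placeholders, so verify the final command against the documentation for your SQL Server version before running it.

```python
from typing import Optional


def build_backup_to_url_sql(
    database: str,
    storage_account: str,
    container: str,
    blob_name: str,
    credential_name: Optional[str] = None,
    copy_only: bool = True,
    compression: bool = True,
) -> str:
    """Return the text of a BACKUP DATABASE ... TO URL statement.

    credential_name: pass a SQL credential name for the older
    (SQL 2012 / 2014 style) syntax; leave it as None for the newer
    style where the credential is matched by container URL.
    All names here are illustrative placeholders.
    """
    # Bracket-quote the database name and escape any closing bracket.
    db_ident = "[" + database.replace("]", "]]") + "]"

    def quote(value: str) -> str:
        # Single-quote a T-SQL string literal, doubling embedded quotes.
        return "'" + value.replace("'", "''") + "'"

    url = (
        f"https://{storage_account}.blob.core.windows.net/"
        f"{container}/{blob_name}"
    )

    options = []
    if credential_name:
        options.append(f"CREDENTIAL = {quote(credential_name)}")
    if copy_only:
        # COPY_ONLY keeps this backup out of the regular backup chain.
        options.append("COPY_ONLY")
    if compression:
        options.append("COMPRESSION")

    sql = f"BACKUP DATABASE {db_ident} TO URL = {quote(url)}"
    if options:
        sql += " WITH " + ", ".join(options)
    return sql + ";"


if __name__ == "__main__":
    # Older style, with a named credential (placeholder names).
    print(build_backup_to_url_sql(
        "TWO", "mystorageacct", "sqlbackups", "TWO_copyonly.bak",
        credential_name="AzureBackupCredential",
    ))
    # Newer style, no WITH CREDENTIAL clause.
    print(build_backup_to_url_sql(
        "TWO", "mystorageacct", "sqlbackups", "TWO_copyonly.bak",
    ))
```

The generated statement could be pasted into a SQL Server Agent job step. COPY_ONLY is on by default in the sketch so that an occasional Azure backup does not interfere with an existing scheduled backup chain.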
The hardest part is setting up your Azure account and creating the appropriate Azure Storage account. It took me a few tries to find the correct settings.
First, you have to have an Azure account, which I won't cover here, but it should be a pretty simple process. Here is the sign up page to get started: https://azure.microsoft.com/en-us/free/
Once you have your Azure account set up and have logged in to the Azure Portal (https://portal.azure.com), click on the "More Services" option at the bottom of the services list on the left. In the search box, type "storage" and a few options should be displayed.
I chose the newer "Storage Accounts" option (not "classic"). To pin this to your services list, click the star to the right.
Tuesday, September 26, 2017
Free SFTP file transfer and data export tool for Dynamics GP file-based integrations
By Steve Endow
A somewhat common requirement for file-based integrations between Dynamics GP and external services or SaaS solutions involves uploading or downloading files from an SFTP server (SSH File Transfer Protocol, completely different from the similarly named FTP or FTPS). SFTP has some technical quirks, so it is often a hassle for customers to automate SFTP file transfers as part of their Dynamics GP integrations.
Some of those integrations also involve exporting data from Dynamics GP to a CSV file and uploading that data to an SFTP server.
To handle this task, I have developed an application that can export data from GP, save it to a CSV file, and upload it to an SFTP server. It can also download files from an SFTP server. The tool is fully automated, can be scheduled using Windows Task Scheduler, and it includes file archiving, logging, and email notification in case of errors.
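For readers curious about the general shape of such a workflow, here is a minimal Python sketch of the export, upload, and archive steps. It is not the Precipio tool's code, and every name in it is hypothetical. The upload step is passed in as a function so the sketch stays self-contained; in practice that function might wrap an SFTP client library, and the rows would come from a query against the GP database.

```python
import csv
import logging
import shutil
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterable, Optional, Sequence

log = logging.getLogger("export_upload")


def export_and_upload(
    header: Sequence[str],
    rows: Iterable[Sequence[str]],
    out_dir: Path,
    archive_dir: Path,
    upload: Callable[[Path], None],
    now: Optional[datetime] = None,
) -> Path:
    """Write rows to a timestamped CSV, upload it, then archive it.

    upload is a caller-supplied function (hypothetical here) that
    sends the file to the remote server and raises on failure.
    Returns the path of the archived file.
    """
    now = now or datetime.now()
    out_dir = Path(out_dir)
    archive_dir = Path(archive_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # 1. Export: write the data to a timestamped CSV file.
    csv_path = out_dir / f"gp_export_{now:%Y%m%d_%H%M%S}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

    # 2. Upload: on failure, log it and leave the file for a retry.
    try:
        upload(csv_path)
    except Exception:
        log.exception("Upload failed; leaving %s in place", csv_path)
        raise

    # 3. Archive: move the uploaded file out of the outbound folder.
    archived = archive_dir / csv_path.name
    shutil.move(str(csv_path), str(archived))
    log.info("Uploaded and archived %s", archived)
    return archived
```

A script like this could be run on a schedule with Windows Task Scheduler, with the logging handler configured to send an email when an error is recorded.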
If you use Blackline, Coupa, IQ BackOffice, or any other provider or service that requires upload or download of files with an SFTP server, this tool may be helpful. It can be used in place of WinSCP or similar tools that require command line scripting.
I am offering this tool for free to the Dynamics GP community. It can be downloaded from my web site at:
https://precipioservices.com/sftp/
The download includes a user guide and sample configuration file. There are quite a few configuration settings, so please make sure to review the documentation to understand how the settings are used.
If you end up using the Precipio SFTP tool, I would love to hear about which system or service you are using it with and how it ends up working for you.
I started a thread on the GPUG Open Forum if you want to discuss the SFTP tool:
https://www.gpug.com/communities/community-home/digestviewer/viewthread?GroupId=247&MessageKey=d6f5ce8b-1fdd-4fb1-abcc-9e7e529ce013&CommunityKey=4754a624-39c5-4458-8105-02b65a7e929e&tab=digestviewer
If you have questions or encounter issues, you can contact me through my web site at:
https://precipioservices.com/contact-us/
Wednesday, September 20, 2017
The 10th and 11th ways you can lose your SQL data...
By Steve Endow
Brent Ozar has an excellent post where he shares 9 stories about how customers lost some or all of their SQL Server data.
https://www.brentozar.com/archive/2015/02/9-ways-to-lose-your-data/
What's great about his stories is that as I read each one, I thought "Yep, I can totally see that happening." A simple oversight, a small mistake, one person making a change without realizing it affected other systems, or simply forgetting to change back a single setting in SQL Server. The one about invalid SQL SMTP settings preventing error emails from going out reminded me of my recent Synology drive failures, as I also had invalid SMTP settings and hadn't received the hundreds of error emails telling me I had a problem--so I am certain that is a common symptom.
While stories about hurricanes, floods, tornadoes, or fires may provide great drama for discussion about disaster recovery, I suspect that there are far more disasters that are caused by a few clicks of a mouse, followed by "Ooops." (or "OH MY GOD WHAT HAVE I DONE???")
I have two data loss stories of my own to add to the SQL data loss lore.
Pulling the Wrong Drive
Many years ago, I was a "business systems consultant" for a Big 6 (at the time) consulting firm and somehow ended up helping a customer with their Solomon IV implementation after their sole IT employee quit. I knew Solomon IV, knew VB, knew SQL, and knew hardware, so I was juggling everything and helping them finish their implementation.
Their Hewlett Packard server that hosted the Solomon IV databases was having some issues with its RAID array. The server had mirrored drives that hosted the database files, and occasionally that mirror would 'break' for no good reason. Windows would mark one drive as inactive, and the server would run on one of the drives until we removed the inactivated drive, reinserted it, and repaired the array. This had happened once or twice before, and I was on site at the customer when it happened again. I checked Windows, checked the array, confirmed the mirror had broken. I then pulled the drive, reinserted the drive, and then started the array rebuild. No problem.
Shortly after that, a user noticed that a transaction they entered that morning was no longer available in Solomon. Then another user. Then another. We eventually discovered that all of the transactions and data that had been entered that day were gone. What happened?
After pondering for a while, I realized what I had done. When the RAID mirror broke, Windows would say that one drive had been inactivated, but it wasn't always clear which drive had been inactivated. You had to poke around to figure out if it was the drive on the left or the drive on the right--I don't remember the process, and it might have even been as high tech as watching to see which blinky light on one of the drives wasn't blinking.
I had either mis-read the drive info or not looked carefully enough, and I had pulled out the wrong drive. The active drive. The one that was working and had been saving the transactions and data that day. After I reinserted the drive, I then chose the 'bad' drive, the one that hadn't been active at all that day, marked it as the primary, and then rebuilt the mirror with the old data from that drive. Thereby losing the data that had been entered that day.
This was pre-SQL Server, so we didn't have transaction log backups. Even if we had a full backup from the prior evening, it wouldn't have helped, as it was only that day's data that was lost. Fortunately, I think it was only mid-day, so the users only lost the data from that morning and were able to reconstruct the transactions from paper, email, and memory.
Ever since I made that mistake, I am extremely paranoid about which physical drive is mapped to RAID arrays or Windows drive letters. If you've built a PC or server in the last several years, you may know that Windows will assign drive letters semi-randomly to SATA drives. And when I had two bad drives in my Synology, I double and triple checked that the drive numbers provided by the Synology did in fact map to the physical drives in the unit, from left to right.
I'm hoping that I never pull the wrong drive again.
Test vs. Production
In Brent's blog post, he shared a story about someone logging into the wrong server--they thought they had logged into a test environment, but were actually dropping databases in production.
I have a similar story, but it was much more subtle, and fortunately it had a happier ending.
I was testing a Dynamics GP Accounts Payable integration script. I must have been testing importing AP invoices, and I had a script to delete all AP transactions from the test database and reload sample data. So I'm running my scripts and doing my integration testing, and a user calls me to tell me that they can't find an AP transaction. We then start looking, and the user tells me that transactions are disappearing. What?
As we were talking, all of the posted AP transactions disappeared. All AP history was gone.
Well, that's weird, I thought.
And then it hit me. My script. That deletes AP transactions. That I ran on the Test database.
But how?
Somehow, I apparently ran that script against the production company database. I was probably flipping between windows in SQL Management Studio and ended up with the wrong database selected in the UI. And the customer had so much AP data that it took several minutes to delete it all, as I was talking to the user, and as we watched the data disappear.
You know that gut wrenching feeling of terror when your stomach feels like it's dropped out of your body? Followed by sweat beading on your brow? That's pretty much how I felt once I guessed that I had probably accidentally run my Test Delete script on the production database. Terror.
In a mad scramble that amazes me to this day, I somehow kept my sanity, figured out what happened, and came up with an insane plan to restore the AP data. Fortunately, the customer had good SQL backups and had SQL transaction logs. For some reason, I didn't consider a full database restore--I don't recall why--perhaps it was because it would require all users to stop their work and we would have lost some sales data. So I instead came up with the crazy idea of reading the activity in the SQL log files. Like I said, insane.
So I found an application called SQL Log Rescue by RedGate Software that allowed me to view the raw activity in SQL Server log files. I was able to open the latest log file, read all of the activity, and see my fateful script that deleted all of the data. I was also able to view the full data of the records that were deleted and generate SQL scripts that would re-insert the deleted data. Miraculously, that crazy plan worked, and SQL Log Rescue saved me. I was able to insert all of the data back into the Accounts Payable tables, and then restart my heart.
Thinking back on it, I suspect that the more proper approach would have been to do a SQL transaction log backup and then perform a proper point in time recovery of the entire database. Or I could have restored to a new database and then copied the data from the restore into production. But as Brent's stories also demonstrate, we don't always think clearly when working through a problem.
So when you're planning your backup routines and disaster recovery scenarios, review the stories that Brent shares and see if your backup plans would handle each of them. And then revisit them occasionally to make sure the backups are working and you are still able to handle those scenarios.
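For the Test vs. Production mistake in particular, one cheap safeguard is to make a destructive test script refuse to run unless the connected database is on an explicit allow list. Below is a minimal Python sketch of that idea. The function and exception names are hypothetical, and it assumes the calling script first reads the current database name from its open connection (for example with SELECT DB_NAME()) before it deletes anything.

```python
from typing import Iterable


class WrongDatabaseError(RuntimeError):
    """Raised when a destructive script targets a non-test database."""


def require_test_database(current_db: str, allowed_test_dbs: Iterable[str]) -> str:
    """Return the normalized database name if it is an approved test
    database; otherwise raise before any data can be deleted.

    current_db should be the name reported by the open connection,
    not the name the operator believes is selected.
    """
    normalized = current_db.strip().upper()
    allowed = {name.strip().upper() for name in allowed_test_dbs}
    if normalized not in allowed:
        raise WrongDatabaseError(
            f"Refusing to run: connected to '{current_db}', "
            f"expected one of {sorted(allowed)}"
        )
    return normalized


if __name__ == "__main__":
    # Hypothetical database names: TEST is the only approved target.
    require_test_database("TEST", ["TEST"])      # passes silently
    try:
        require_test_database("PROD", ["TEST"])  # blocked
    except WrongDatabaseError as err:
        print(err)
```

The check is deliberately an allow list rather than a block list, so a new or renamed production database is rejected by default.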
Brent Ozar has an excellent post where he shares 9 stories about how customers lost some or all of their SQL Server data.
https://www.brentozar.com/archive/2015/02/9-ways-to-lose-your-data/
What's great about his stories is that as I read each one, I thought "Yep, I can totally see that happening." A simple oversight, a small mistake, one person making a change without realizing it affected other systems, or simply forgetting to change back a single setting in SQL Server. The one about invalid SQL SMTP settings preventing error emails from going out reminded me of my recent Synology drive failures, as I also had invalid SMTP settings and hadn't received the hundreds of error emails telling me I had a problem--so I am certain that is a common symptom.
While stories about hurricanes, floods, tornadoes, or fires may provide great drama for discussion about disaster recovery, I suspect that there are far more disasters that are caused by a few clicks of a mouse, followed by "Ooops." (or "OH MY GOD WHAT HAVE I DONE???")
I have two data loss stories of my own to add to the SQL data loss lore.
Pulling the Wrong Drive
Many years ago, I was a "business systems consultant" for a Big 6 (at the time) consulting firm and somehow ended up helping a customer with their Solomon IV implementation after their sole IT employee quit. I knew Solomon IV, knew VB, knew SQL, and knew hardware, so I was juggling everything and helping them finish their implementation.
Their Hewlett Packard server that hosted the Solomon IV databases was having some issues with its RAID array. The server had mirrored drives that hosted the database files, and occasionally that mirror would 'break' for no good reason. Windows would mark one drive as inactive, and the server would run on one of the drives until we removed the inactivated drive, reinserted it, and repaired the array. This had happened once or twice before, and I was on site at the customer when it happened again. I checked Windows, checked the array, confirmed the mirror had broken. I then pulled the drive, reinserted the drive, and then started the array rebuild. No problem.
Shortly after that, a user noticed that a transaction they entered that morning was no longer available in Solomon. Then another user. Then another. We eventually discovered that all of the transactions and data that had been entered that day were gone. What happened?
After pondering for a while, I realized what I had done. When the RAID mirror broke, Windows would say that one drive had been inactivated, but it wasn't always clear which drive had been inactivated. You had to poke around to figure out if it was the drive on the left or the drive on the right--I don't remember the process, and it might have even been as high tech as watching to see which blinky light on one of the drives wasn't blinking.
I had either mis-read the drive info or not looked carefully enough, and I had pulled out the wrong drive. The active drive. The one that was working and had been saving the transactions and data that day. After I reinserted the drive, I then chose the 'bad' drive, the one that hadn't been active at all that day, marked it as the primary, and then rebuilt the mirror with the old data from that drive. Thereby losing the data that had been entered that day.
This was pre-SQL Server, so we didn't have transaction log backups. Even if we had a full backup from the prior evening, it wouldn't have helped, as it was only that day's data that was lost. Fortunately, I think it was only mid-day, so the users only lost the data from that morning and were able to reconstruct the transactions from paper, email, and memory.
Ever since I made that mistake, I am extremely paranoid about which physical drive is mapped to RAID arrays or Windows drive letters. If you've built a PC or server in the last several years, you may know that Windows will assign drive letters semi-randomly to SATA drives. And when I had two bad drives in my Synology, I double and triple checked that the drive numbers provided by the Synology did in fact map to the physical drives in the unit, from left to right.
I'm hoping that I never pull the wrong drive again.
Test vs. Production
In Brent's blog post, he shared a story about someone logging into the wrong server--they thought they had logged into a test environment, but were actually dropping databases in production.
I have a similar story, but it was much more subtle, and fortunately it had a happier ending.
I was testing a Dynamics GP Accounts Payable integration script. I must have been testing importing AP invoices, and I had a script to delete all AP transactions from the test database and reload sample data. So I'm running my scripts and doing my integration testing, and a user calls me to tell me that they can't find an AP transaction. We then start looking, and the user tells me that transactions are disappearing. What?
As we were talking, all of the posted AP transactions disappeared. All AP history was gone.
Well, that's weird, I thought.
And then it hit me. My script. That deletes AP transactions. That I ran on the Test database.
But how?
Somehow, I apparently ran that script against the production company database. I was probably flipping between windows in SQL Management Studio and ended up with the wrong database selected in the UI. And the customer had so much AP data that it took several minutes to delete it all, as I was talking to the user, and as we watched the data disappear.
You know that gut wrenching feeling of terror when your stomach feels like it's dropped out of your body? Followed by sweat beading on your brow? That's pretty much how I felt once I guessed that I had probably accidentally run my Test Delete script on the production database. Terror.
In a mad scramble that amazes me to this day, I somehow kept my sanity, figured out what happened, and came up with an insane plan to restore the AP data. Fortunately, the customer had good SQL backups and had SQL transaction logs. For some reason, I didn't consider a full database restore--I don't recall why--perhaps it was because it would require all users to stop their work and we would have lost some sales data. So I instead came up with the crazy idea of reading the activity in the SQL log files. Like I said, insane.
So I found an application called SQL Log Rescue by RedGate Software that allowed me to view the raw activity in SQL Server log files. I was able to open the latest log file, read all of the activity, see my fateful script that deleted all of the data. I was also able to view the full data of the records that were deleted and generate SQL scripts that would re-insert the deleted data. Miraculously, that crazy plan worked, and SQL Log Rescue saved me. I was able to insert all of the data back into the Accounts Payables tables, and then restart my heart.
Thinking back on it, I suspect that the more proper approach would have been to do a SQL transaction log backup and then perform a proper point-in-time recovery of the entire database. Or I could have restored to a new database and then copied the data from the restore into production. But as Brent's stories also demonstrate, we don't always think clearly when working through a problem.
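One way to make the wrong-window mistake harder to repeat is to have any destructive test script check its target before it does anything. The following is a minimal sketch in Python, not what was used at the time. The database names, the allow-list, and the `confirm_test_target` helper are all hypothetical, and it assumes the script can find out which database it is connected to.

```python
class WrongDatabaseError(RuntimeError):
    """Raised when a destructive script is pointed at a non-test database."""


# Hypothetical allow-list: only these databases may be wiped by test scripts.
ALLOWED_TEST_DATABASES = {"TEST"}


def confirm_test_target(database_name, allowed=ALLOWED_TEST_DATABASES):
    """Return the normalized name if it is an approved test database, else raise."""
    normalized = database_name.strip().upper()
    if normalized not in allowed:
        raise WrongDatabaseError(
            f"Refusing to run destructive script against '{database_name}'"
        )
    return normalized


if __name__ == "__main__":
    # In practice the name would come from the live connection, for example
    # the result of SELECT DB_NAME() on SQL Server, not from a typed value.
    confirm_test_target("test")
    try:
        confirm_test_target("PROD")
    except WrongDatabaseError as err:
        print(err)
```

The point of the design is that the check runs inside the script itself, so it does not matter which window or database happened to be selected in the UI.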
So when you're planning your backup routines and disaster recovery scenarios, review the stories that Brent shares and see if your backup plans would handle each of them. And then revisit them occasionally to make sure the backups are working and you are still able to handle those scenarios.
Steve Endow is a Microsoft MVP in Los Angeles. He is
the owner of Precipio Services, which provides Dynamics GP integrations,
customizations, and automation solutions.
Tuesday, September 19, 2017
eConnect error: The target principal name is incorrect. Cannot generate SSPI context.
By Steve Endow
A customer recently encountered this error with a Dynamics GP eConnect integration:
The target principal name is incorrect. Cannot generate SSPI context.
Just before this error was reported, a new version of a custom Dynamics GP AddIn had been deployed, so I got the support call, as the partner and customer thought the error was related to the new AddIn.
But this error is related to the eConnect user authentication with SQL Server, so deploying a new DLL shouldn't have affected that authentication.
I recommended that the customer's IT team check the status of the eConnect Windows service on the terminal server and try restarting it. The eConnect service was running, but when they restarted the service, they received a login error.
It seems that some other process on the client's network was attempting to use the Active Directory account assigned to the eConnect service on the terminal server. That other process, whatever it is, apparently has an invalid or old password for the domain account. So it was failing to log in and locking the Active Directory account.
Once the account was locked, the eConnect service on the terminal server would begin receiving the SSPI context errors, as its authentication with SQL Server would fail once the account was locked.
The IT team had previously tried to reset the eConnect account password, but it would just get locked out again by the mystery app or process that was still trying to use the same domain account. So I recommended that they create a new dedicated domain account for use by the eConnect Windows service on the terminal server.
Once they set up the new domain account and updated the eConnect Windows service to use the new account, the problem went away.
However, this morning the error seemed to occur again, but restarting the eConnect service appears to have resolved it. Given this odd recurrence, some other cause or detail may be contributing to the problem.
Friday, September 8, 2017
Multiple hard drive failures on a Synology NAS: Lessons Learned
By Steve Endow
This is a long post, but I think the context and the entire story help paint a picture of how things can fail in unexpected and odd ways, and how storage failures can be more complicated to deal with than you might expect. I've learned several lessons so far, and I'm still in the middle of it, so I may learn more as things unfold.
On Tuesday evening, I received several emails from my backup software telling me that backup jobs had failed. Some were from Veeam, my absolute favorite backup software, saying that my Hyper-V backups had failed. Others were from Acronis True Image, saying that my workstation backup had failed.
Hmmm.
Based on the errors, it looks like both backup apps were unable to access my Synology NAS, where their backup files are stored.
That's odd.
When I tried to access the UNC path for my Synology on my Windows desktop, I got an error that the device could not be found. Strange.
I then opened a web browser to log in to the Synology. But the login page wouldn't load. I then checked to make sure the Synology was turned on. Yup, the lights were on.
After several refreshes and a long delay, the login page eventually loaded, but I couldn't log in. I then tried connecting over SSH using PuTTY. I was able to connect, but it was VERY slow. Like 30 seconds to get a login prompt, 30 seconds to respond after entering my username, etc. I was eventually able to log in, so I tried issuing commands to reboot the Synology via the SSH terminal.
After issuing the command for a reboot, the power light started blinking, but the unit didn't shut down. Strangely, after issuing the shutdown command, I was able to log in to the web interface, but it was very slow and wasn't displaying properly. I eventually had to hold the power button down for 10 seconds to hard reset the Synology, and then turned it back on.
After it rebooted, it seemed fine. I was able to browse the shares and access the web interface. Weird.
As a precaution, I submitted a support case with Synology asking them how I should handle this situation in the future and what might be causing it. I didn't think it was a big deal.
On Wednesday evening, I got the same error emails from my backup software. The backups had failed. Again. Once again, the Synology was unresponsive, so I went through the same process, and eventually had to hard reset it to log in and get it working again.
So at this point, it seemed pretty clear there was a real problem. But it was late and I was tired, so I left it to look into in the morning.
On Thursday morning, the Synology was again unresponsive. Fortunately, I received a response from Synology support and sent them a debug log that they had requested. Within 30 minutes I received a reply, informing me that the likely issue was a bad disk.
Apparently the bad disk was causing the Synology to deal with read errors, and that was actually causing the Synology OS kernel to become unstable, or "kernel panic".
This news offered me two surprises. First, I was surprised to learn that I had a bad disk. Why hadn't I known that or noticed that?
Second, I was surprised to learn that a bad disk can make the Synology unstable. I had assumed that a drive failure would be detected and the drive would be taken offline, or some equivalent. I would not have guessed that a drive could fail in a way that would make the NAS effectively unusable.
After reviewing the logs, I found out why I didn't know I had a bad drive.
The log was filled with hundreds of errors, "Failed to send email". Apparently the SMTP authentication had stopped working months ago, and I never noticed. I get so much email that I never noticed the lack of email from the Synology.
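The underlying trap here is that silence looks the same as good news. One possible guard is a "dead man's switch" that complains when a device has not sent any email for too long. A minimal sketch in Python, assuming the device is expected to send at least one report a month; the 35-day window and the `is_silent_too_long` helper are hypothetical, and how the timestamp of the last received email is obtained is left out.

```python
from datetime import datetime, timedelta


def is_silent_too_long(last_email_at, now, max_silence=timedelta(days=35)):
    """Return True when no notification has arrived within the allowed window."""
    return (now - last_email_at) > max_silence


if __name__ == "__main__":
    # Illustrative dates only: the last email arrived about two months ago.
    now = datetime(2017, 9, 7)
    last_email_at = datetime(2017, 7, 1)
    if is_silent_too_long(last_email_at, now):
        print("No email from the NAS recently; check its SMTP settings.")
```

The check has to run somewhere other than the device being watched, since a device with broken SMTP cannot report its own silence by email.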
The drive apparently started to have problems back in July, but up until this week, the Synology seemed to still work, so I had no reason to suspect a problem.
Synology support also informed me that the unit was running a "parity consistency check" to try and verify the data on all of the drives. This process normally slows the unit down, and the bad drive makes the process painfully slow.
After a day and a half, the process is only 20% complete, so this is apparently going to take 4-5 more days.
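As a rough cross-check on that estimate, a straight-line extrapolation can be sketched in a few lines of Python. It assumes the check progresses at a constant rate, which is an assumption and not something Synology states; the `estimate_remaining_days` helper is hypothetical. With 1.5 days elapsed at 20%, it suggests roughly 6 more days, a little longer than my guess above.

```python
def estimate_remaining_days(elapsed_days, fraction_done):
    """Linearly extrapolate the remaining time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be greater than 0 and at most 1")
    total_days = elapsed_days / fraction_done
    return total_days - elapsed_days


if __name__ == "__main__":
    remaining = estimate_remaining_days(elapsed_days=1.5, fraction_done=0.20)
    print(f"Estimated days remaining: {remaining:.1f}")
```

A linear estimate like this is only as good as the assumption that the rate stays steady, which a failing drive does not guarantee.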
So that's great and all, but if I know I have a bad drive, can't I just replace the drive now and get on with the recovery process? Unfortunately, no. Synology support said that I should wait for the parity consistency check to complete before pulling the bad drive, as the process is "making certain you are not suffering data/ volume corruption so you can later repair your volume with no issues."
Lovely. So waiting for this process to complete is preventing me from replacing the bad drive that is causing the process to run so slowly. And I'm going to have to wait for nearly a week to replace the drive, all the while hoping that the drive doesn't die completely.
I'm sensing that this process is less than ideal. It's certainly much messier than what I would have expected from a RAID array drive failure.
But that's not all! Nosiree!
In addition to informing me that I have a bad drive that is causing the Synology to become unusable, it turns out that I have a second drive that is starting to fail in a different manner.
Notice that Disk 6 has a Warning status? That's actually the second bad drive. The first bad drive is Disk 2, which shows a nice happy green "Normal" status.
After reviewing my debug log, Synology support warned me that Disk 6 is accumulating bad sectors.
Sure enough, 61 bad sectors. Not huge, but a sign that there is a problem and it should probably be replaced.
Lovely.
So why did I not know about this problem? Well, even if SMTP had been working properly on my Synology, it turns out that the bad sector warnings are not enabled by default on the Synology. So you can have a disk failing and stacking up bad sectors, but you'd never know it. So that was yet another thing I learned, and I have now enabled that warning.
Correction 1: I remembered that the monthly disk health report shows bad sectors, so if you have that report enabled, and if your SMTP email is working, you will see the bad sector count--assuming you review that email.
Correction 2: A reader noted that new Synology units or new DSM installs apparently do have the Bad Sector Warning notification enabled by default, and set with a default of 50 bad sectors as the threshold. But if you have an existing / older Synology, it likely does not have the Bad Sector Warning enabled.
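The threshold idea itself is simple, and can be sketched in Python as a comparison of each disk's bad sector count against a limit. This is only an illustration of the concept, not how DSM implements it: the `disks_over_threshold` helper is hypothetical, the count for Disk 1 is made up, and whether the real warning fires at or above the threshold is an assumption.

```python
def disks_over_threshold(bad_sectors_by_disk, threshold=50):
    """Return the names of disks whose bad sector count meets the threshold."""
    return sorted(
        name
        for name, count in bad_sectors_by_disk.items()
        if count >= threshold
    )


if __name__ == "__main__":
    # Disk 6 and its count of 61 come from the story; Disk 1 is made up.
    counts = {"Disk 1": 0, "Disk 6": 61}
    for disk in disks_over_threshold(counts):
        print(f"Warning: {disk} has reached the bad sector threshold")
```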
So, here's where I'm at.
I've fixed the email settings so that I am now getting email notifications.
I'm 20% into the parity consistency check, and will have to wait 5+ more days for that to finish.
As soon as I learned that I had 2 bad drives on Thursday morning, I ordered two replacement drives. I paid $50 for overnight express shipment with morning delivery. Because I wanted to replace the drives right away, right? But that was before Synology emphasized that I should wait for the parity check to complete. So those drives are going to sit in the box for a week--unless a drive dies completely in the meantime.
If the parity check does complete successfully, I'll be able to replace Drive 2, which is the one with the serious problems. I'll then have to wait for the Synology to rebuild the array and populate that drive.
Once that is done, I'll be able to replace Drive 6, and wait for it to rebuild.
Great, all done, right?
Nope. I'll need to hook up the two bad drives and run the manufacturer diagnostics and hopefully get clear evidence of an issue that allows me to RMA the drives. Because I will want the extra drives. If I can't get an RMA, I'll be buying at least 1 new drive.
This experience has made me think differently about NAS units. My Synology has 8 drive bays, and I have 6 drives in it. The Synology supports hot spare drives, so I will be using the additional drives to fill the other two bays and have at least one hot spare available, and most likely 2 hot spares.
Previously, I didn't think much of hot spares. If a drive fails, RAID lets you limp along until you replace the bad drive, right? In concept. But as I have experienced, a "drive failure" isn't always a nice clean drive death. And this is the first time I've seen two drives in the same RAID array have issues.
And it's also shown me that when drives have issues, but don't fail outright, they can make the NAS virtually unusable for days. I had never considered this scenario. While I'm waiting to fix my main NAS, my local backups won't work. And this Synology is also backing up its data to Backblaze B2 for my offsite backup. That backup is also disabled while the parity check runs. And I then have another on-site backup to a second Synology unit using HyperBackup. Again, that backup is not working either. So my second and third level backups are not available until I get my main unit fixed.
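The dependency problem described above can be made visible with a small model: list each backup job with the devices it passes through, then ask which jobs break when one device is down. A minimal sketch in Python; the job labels and the `broken_backups` helper are hypothetical names for the setup described in this post.

```python
# Each backup job mapped to the devices it depends on, per the setup above.
BACKUP_PATHS = {
    "Veeam Hyper-V backups": ["main Synology"],
    "Acronis workstation backup": ["main Synology"],
    "Backblaze B2 offsite copy": ["main Synology", "Backblaze B2"],
    "HyperBackup on-site copy": ["main Synology", "second Synology"],
}


def broken_backups(paths, failed_device):
    """Return the backup jobs that depend on the failed device."""
    return sorted(
        job for job, devices in paths.items() if failed_device in devices
    )


if __name__ == "__main__":
    for job in broken_backups(BACKUP_PATHS, "main Synology"):
        print(f"Not running: {job}")
```

Every job in this model routes through the main Synology, which is why a single limping unit took out the local, second-level, and offsite backups at once.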
Do I redirect my backup software to save to my second Synology? Will that mess up my backup history and backup chains? I don't know. I'll have to see if I can add secondary backup repositories to Veeam and Acronis and perhaps merge them later.
Another change I'll be making is to back up more data to my Backblaze B2 account. I realized that I was only backing up some of the data from my main Synology to B2. I'll now be backing up nearly everything to B2.
So this has all been much messier than I would have imagined. Fortunately it hasn't been catastrophic, at least not yet. Hopefully I can replace the drives and everything will be fine, but the process has made me realize that it's really difficult to anticipate the complications from storage failures.
Update: It's now Monday morning (9/11/2017), 5 full days after the Synology was last rebooted and the parity consistency check was started, and it's only at 31%. I did copy some files off of this unit to my backup Synology, which seems to pause or stop the parity check, but at this speed, it's going to take weeks to finish. This long parity processing does seem to be a result of the bad Drive 2, as the parity consistency check recently ran on my other Synology in under a day.
Update 2: Tuesday morning, 9/12/2017. The parity consistency check is at 33.4%. Painfully slow. My interpretation is that any task, job, process, or file operation on the Synology seems to pause or delay the parity consistency check. I have now disabled all HyperBackup jobs, paused CloudSync, and stopped my Veeam backup jobs to minimize activity on the limping Synology. I may turn off file sharing as well, just to ensure that network file access isn't interfering with the parity check process.
I also just noticed that the File Services settings on the Synology show that SMB is not enabled. My understanding is that this is required for Windows file sharing, so I'm puzzled how I'm still able to access the Synology shares from my desktop. I didn't turn it off, so I'm not sure if this is due to the Synology being in a confused state due to the drive issues, or something else. I find it strange that my backup software is unable to access the Synology shares, but I'm able to eventually access them--although they are very slow to come up.
Update 3: Monday, 9/18/2017 - The Saga Continues: After thinking about it, I realized that the parity consistency check was likely triggered because I powered off the Synology before it shut down on its own. At the time, I thought that the unit was hung or unresponsive, but I now realize that it was the bad disk that was causing the shutdown to take forever. The parity check is estimated to take 2-4 years due to the bad drive, so I just shut the unit down to replace the bad drive. It took 30-60 minutes for it to fully power down. If you encounter an issue with a drive that causes the Synology to be slow or seem unresponsive, avoid doing a hard reset or hard shutdown on the unit. Try the shutdown command and wait an hour or two to see if the shutdown eventually completes on its own. This will hopefully allow you to avoid a parity consistency check, which is a major hassle with a bad drive.
Now that I've replaced the drive and powered the Synology on, the parity consistency check is still running, and I'm unable to add the replacement disk to my volume. I've replied to Synology support on my existing case asking them how to cancel the parity consistency check and just add the replacement drive so that it can get started on the volume repair process.
Update 4: 9/18/2017: After replacing the bad drive, I see that the parity consistency check is running much faster and I may not have to cancel it. With the bad drive, the process was estimated to take 2-4 years (yes YEARS) to complete, but with the new drive, it is currently estimating about 16 hours. I'm going to let it run overnight and see how much progress it has made by tomorrow morning.
Update 5: 9/19/2017: The parity consistency check finally completed and the Synology began to beep every few seconds, indicating that the volume was "degraded".
Since the parity check was no longer running, the "Manage" button became active, and I was able to add the new drive to the volume and start the repair process, which was quite simple.
So the repair process is now running and it looks like it will take about 26 hours to complete.
Update 6: 9/20/2017: The repair process appears to be going well and should complete today.
While the repair is running, I plugged the bad drive into my desktop and ran the HGST "DFT for Windows" diagnostic application to test the drive. Interestingly, it is not detecting any problems. On the extended tests, it appears to be hanging, but it isn't identifying a problem.
Final update: 9/22/2017: I replaced the second bad drive and the Synology has repaired the volume. Things are back to normal and working well.
I created RMAs for both of the HGST hard drives and mailed them back, so I should get replacements for those drives, which I'll install in the Synology as hot spares.
This is a long post, but I think the context and the entire story help paint a picture of how things can fail in unexpected and odd ways, and how storage failures can be more complicated to deal with than you might expect. I learned several lessons so far, and I'm still in the middle of it, so I may learn more as things unfold.
On Tuesday evening, I received several emails from my backup software telling me that backup jobs had failed. Some were from Veeam, my absolute favorite backup software, saying that my Hyper-V backups had failed. Others were from Acronis True Image, saying that my workstation backup had failed.
Hmmm.
Based on the errors, it looks like both backup apps were unable to access my Synology NAS, where their backup files are stored.
That's odd.
When I tried to access the UNC path for my Synology on my Windows desktop, I got an error that the device could not be found. Strange.
I then opened a web browser to login to the Synology. But the login page wouldn't load. I then checked to make sure the Synology was turned on. Yup, the lights were on.
After several refreshes and a long delay, the login page eventually loaded, but I couldn't login. I then tried connecting over SSH using Putty. I was able to connect, but it was VERY slow. Like 30 seconds to get a login prompt, 30 seconds to respond after entering my username, etc. I was eventually able to login, so I tried these commands to try and reboot the Synology via the SSH terminal.
After issuing the command for a reboot, the power light started blinking, but the unit didn't shutdown. Strangely, after issuing the shutdown command, I was able to login to the web interface, but it was very slow and wasn't displaying properly. I eventually had to hold the power button down for 10 seconds to hard reset the Synology, and then turned it back on.
After it rebooted, it seemed fine. I was able to browse the shares and access the web interface. Weird.
As a precaution, I submitted a support case with Synology asking them how I should handle this situation in the future and what might be causing it. I didn't think it was a big deal.
On Wednesday evening, I got the same error emails from my backup software. The backups had failed. Again. Once again, the Synology was unresponsive, so I went through the same process, and eventually had to hard reset it to login and get it working again.
So at this point, it seemed pretty clear there is a real problem. But it was late and I was tired, so I left it and would look into it in the morning.
On Thursday morning, the Synology was again unresponsive. Fortunately, I received a response from Synology support and sent them a debug log that they had requested. Within 30 minutes I received a reply, informing me that the likely issue was a bad disk.
Apparently the bad disk was causing the Synology to deal with read errors, and that was actually causing the Synology OS kernel to become unstable, or "kernel panic".
This news offered me two surprises. First, I was surprised to learn that I had a bad disk. Why hadn't I known that or noticed that?
Second, I was surprised to learn that a bad disk can make the Synology unstable. I had assumed that a drive failure would be detected and the drive would be taken offline, or some equivalent. I would not have guessed that a drive could fail in a way that would make the NAS effectively unusable.
After reviewing the logs, I found out why I didn't know I had a bad drive.
The log was filled with hundreds of errors, "Failed to send email". Apparently the SMTP authentication had stopped working months ago, and I never noticed. I get so much email that I never noticed the lack of email from the Synology.
The drive apparently started to have problems back in July, but up until this week, the Synology seemed to still work, so I had no reason to suspect a problem.
Synology support also informed me that the unit was running a "parity consistency check" to try and verify the data on all of the drives. This process normally slows the unit down, and the bad drive makes the process painfully slow.
After a day and a half, the process is only 20% complete, so this is apparently going to take 4-5 more days.
So that's great and all, but if I know I have a bad drive, can't I just replace the drive now and get on with the recovery process? Unfortunately, no. Synology support said that I should wait for the parity consistency check to complete before pulling the bad drive, as the process is "making certain you are not suffering data/ volume corruption so you can later repair your volume with no issues."
Lovely. So waiting for this process to complete is preventing me from replacing the bad drive that is causing the process to run so slowly. And I'm going to have to wait for nearly a week to replace the drive, all the while hoping that the drive doesn't die completely.
I'm sensing that this process is less than ideal. It's certainly much messier than what I would have expected from a RAID array drive failure.
But that's not all! Nosiree!
In addition to informing me that I have a bad drive that is causing the Synology to become unusable, it turns out that I have a second drive that is starting to fail in a different manner.
Notice that Disk 6 has a Warning status? That's actually the second bad drive. The first bad drive is Disk 2, which shows a nice happy green "Normal" status.
After reviewing my debug log, Synology support warned me that Disk 6 is accumulating bad sectors.
Sure enough, 61 bad sectors. Not huge, but a sign that there is a problem and it should probably be replaced.
Lovely.
So why did I not know about this problem? Well, even if SMTP had been working properly on my Synology, it turns out that the bad sector warnings are not enabled by default on the Synology. So you can have a disk failing and stacking up bad sectors, but you'd never know it. So that was yet another thing I learned, and I have now enabled that warning.
Correction 1: I remembered that the monthly disk health report shows bad sectors, so if you have that report enabled, and if your SMTP email is working, you will see the bad sector count--assuming you review that email.
Correction 2: A reader noted that new Synology units or new DSM installs apparently do have the Bad Sector Warning notification enabled by default, and set with a default of 50 bad sectors as the threshold. But if you have an existing / older Synology, it likely does not have the Bad Sector Warning enabled.
So, here's where I'm at.
I've fixed the email settings so that I am now getting email notifications.
I'm 20% into the parity consistency check, and will have to wait 5+ more days for that to finish.
As soon as I learned that I had 2 bad drives on Thursday morning, I ordered two replacement drives. I paid $50 for overnight express shipment with morning delivery. Because I wanted to replace the drives right away, right? But that was before Synology emphasized that I should wait for the parity check to complete. So those drives are going to sit in the box for a week--unless a drive dies completely in the meantime.
If the parity check does complete successfully, I'll be able to replace Drive 2, which is the one with the serious problems. I'll then have to wait for the Synology to rebuild the array and populate that drive.
Once that is done, I'll be able to replace Drive 6, and wait for it to rebuild.
Great, all done, right?
Nope. I'll need to hook up the two bad drives and run the manufacturer diagnostics and hopefully get clear evidence of an issue that allows me to RMA the drives. Because I will want the extra drives. If I can't get an RMA, I'll be buying at least 1 new drive.
This experience has made me think differently about NAS units. My Synology has 8 drive bays, and I have 6 drives in it. The Synology supports hot spare drives, so I will be using the additional drives to fill the other two bays and have at least one hot spare available, and most likely 2 hot spares.
Previously, I didn't think much of hot spares. If a drive fails, RAID lets you limp along until you replace the bad drive right? In concept. But as I have experienced, a "drive failure" isn't always a nice clean drive death. And this is the first time I've seen two drives in the same RAID array have issues.
And it's also shown me that when drives have issues, but don't fail outright, they can make the NAS virtually unusable for days. I had never considered this scenario. While I'm waiting to fix my main NAS, my local backups won't work. And this Synology is also backing up its data to Backblaze B2 for my offsite backup. That backup is also disabled while the parity check runs. And I then have another on-site backup to a second Synology unit using HyperBackup. Again, that backup is not working either. So my second and third level backups are not available until I get my main unit fixed.
Do I redirect my backup software to save to my second Synology? Will that mess up my backup history and backup chains? I don't know. I'll have to see if I can add secondary backup repositories to Veeam and Acronis and perhaps merge them later.
Another change I'll be making is to back up more data to my Backblaze B2 account. I realized that I was only backing up some of the data from my main Synology to B2. I'll now be backing up nearly everything to B2.
So this has all been much messier than I would have imagined. Fortunately it hasn't been catastrophic, at least not yet. Hopefully I can replace the drives and everything will be fine, but the process has made me realize that it's really difficult to anticipate the complications from storage failures.
Update: It's now Monday morning (9/11/2017), 5 full days after the Synology was last rebooted and the parity consistency check was started, and it's only at 31%. I did copy some files off of this unit to my backup Synology, which seems to pause or stop the parity check, but at this speed, it's going to take weeks to finish. This long parity processing does seem to be a result of the bad Drive 2, as the parity consistency check recently ran on my other Synology in under a day.
Update 2: Tuesday morning, 9/12/2017. The parity consistency check is at 33.4%. Painfully slow. My interpretation is that any task, job, process, or file operation on the Synology seems to pause or delay the parity consistency check. I have now disabled all HyperBackup jobs, paused CloudSync, and stopped my Veeam backup jobs to minimize activity on the limping Synology. I may turn off file sharing as well, just to ensure that network file access isn't interfering with the parity check process.
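As a rough back-of-the-envelope check on that "weeks" estimate, here is a small Python sketch that extrapolates from the two readings above (31% after 5 full days, 33.4% roughly a day later). It assumes progress is linear, which may well not hold on a failing drive, and the function name is just something I made up for illustration.

```python
def remaining_days(percent_done: float, elapsed_days: float) -> float:
    """Linear extrapolation: days left if the average rate so far continues."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total_days = elapsed_days * 100.0 / percent_done
    return total_days - elapsed_days


# Readings taken from the updates: 31% after 5 full days, 33.4% about a day later.
overall = remaining_days(31.0, 5.0)       # uses the average rate since the check started
recent_rate = 33.4 - 31.0                 # percentage points gained in roughly one day
recent = (100.0 - 33.4) / recent_rate     # days left at the most recent daily rate

print(f"At the average rate: about {overall:.0f} more days")
print(f"At the latest daily rate: about {recent:.0f} more days")
```

Either way the answer is measured in weeks, not days, and the latest reading suggests the check is slowing down rather than speeding up.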
I also just noticed that the File Services settings on the Synology show that SMB is not enabled. My understanding is that this is required for Windows file sharing, so I'm puzzled how I'm still able to access the Synology shares from my desktop. I didn't turn it off, so I'm not sure if this is due to the Synology being in a confused state due to the drive issues, or something else. I find it strange that my backup software is unable to access the Synology shares, but I'm able to eventually access them--although they are very slow to come up.
Update 3: Monday, 9/18/2017 - The Saga Continues: After thinking about it, I realized that the parity consistency check was likely triggered because I powered off the Synology before it shut down on its own. At the time, I thought that the unit was hung or unresponsive, but I now realize that it was the bad disk that was causing the shutdown to take forever. The parity check is estimated to take 2-4 years due to the bad drive, so I just shut the unit down to replace the bad drive. It took 30-60 minutes for it to fully power down. If you encounter an issue with a drive that causes the Synology to be slow or seem unresponsive, avoid doing a hard reset or hard shutdown on the unit. Try the shutdown command and wait an hour or two to see if the shutdown eventually completes on its own. This will hopefully allow you to avoid a parity consistency check, which is a major hassle with a bad drive.
Now that I've replaced the drive and powered the Synology on, the parity consistency check is still running, and I'm unable to add the replacement disk to my volume. I've replied to Synology support on my existing case asking them how to cancel the parity consistency check and just add the replacement drive so that it can get started on the volume repair process.
Update 4: 9/18/2017: After replacing the bad drive, I see that the parity consistency check is running much faster and I may not have to cancel it. With the bad drive, the process was estimated to take 2-4 years (yes YEARS) to complete, but with the new drive, it is currently estimating about 16 hours. I'm going to let it run overnight and see how much progress it has made by tomorrow morning.
Update 5: 9/19/2017: The parity consistency check finally completed and the Synology began to beep every few seconds, indicating that the volume was "degraded".
Since the parity check was no longer running, the "Manage" button became active, and I was able to add the new drive to the volume and start the repair process, which was quite simple.
So the repair process is now running and it looks like it will take about 26 hours to complete.
Update 6: 9/20/2017: The repair process appears to be going well and should complete today.
While the repair is running, I plugged the bad drive into my desktop and ran the HGST "DFT for Windows" diagnostic application to test the drive. Interestingly, it is not detecting any problems. On the extended tests, it appears to be hanging, but it isn't identifying a problem.
Final update: 9/22/2017: I replaced the second bad drive and the Synology has repaired the volume. Things are back to normal and working well.
I created RMAs for both of the HGST hard drives and mailed them back, so I should get replacements for those drives, which I'll install in the Synology as hot spares.
Steve Endow is a Microsoft MVP in Los Angeles. He is the owner of Precipio Services, which provides Dynamics GP integrations, customizations, and automation solutions.
Friday, August 25, 2017
Bug in Dynamics GP eConnect taCreateSOPTrackingInfo: Error 4628
By Steve Endow
I'm working on an import that will insert shipment tracking numbers for Dynamics GP SOP Sales Orders. Seems pretty straightforward.
When I attempt to import the tracking number for an order, I get this error from eConnect.
Error Number = 4628
Stored Procedure= taCreateSOPTrackingInfo
Error Description = The Tracking Number (Tracking_Number) is empty
Node Identifier Parameters: taCreateSOPTrackingInfo
SOPNUMBE = WEB0001
SOPTYPE = 2
Tracking_Number = 1Z12345E0205271688
Related Error Code Parameters for Node : taCreateSOPTrackingInfo
Tracking_Number = 1Z12345E0205271688
<taCreateSOPTrackingInfo>
<SOPNUMBE>WEB0001</SOPNUMBE>
<SOPTYPE>2</SOPTYPE>
<Tracking_Number>1Z12345E0205271688</Tracking_Number>
</taCreateSOPTrackingInfo>
It seems pretty obvious that something isn't right with this error. Clearly the tracking number is being supplied.
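If you want to rule out your own XML before blaming eConnect, a quick sanity check is to parse the node and confirm the element really has a value. This is only a minimal Python sketch: it looks at the single node shown above, not a full eConnect document, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The node from the error output above, with the tag spacing cleaned up.
NODE_XML = """
<taCreateSOPTrackingInfo>
  <SOPNUMBE>WEB0001</SOPNUMBE>
  <SOPTYPE>2</SOPTYPE>
  <Tracking_Number>1Z12345E0205271688</Tracking_Number>
</taCreateSOPTrackingInfo>
"""


def tracking_number_supplied(node_xml: str) -> bool:
    """Return True if the Tracking_Number element exists and is not blank."""
    node = ET.fromstring(node_xml.strip())
    value = node.findtext("Tracking_Number", default="")
    return bool(value.strip())


print(tracking_number_supplied(NODE_XML))
```

If this reports that the value is present and eConnect still returns error 4628, the problem is on the stored procedure side.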
So off we go to debug eConnect.
When we open the taCreateSOPTrackingInfo stored procedure and search for error 4628, we see this gem:
IF ( @I_vTracking_Number <> '' )
BEGIN
    SELECT @O_iErrorState = 4628;
    EXEC @iStatus = taUpdateString @O_iErrorState,
                                   @oErrString,
                                   @oErrString OUTPUT,
                                   @iAddCodeErrState OUTPUT;
END;
So. If the tracking number parameter has a value, the stored procedure returns error 4628, saying that the tracking number is empty. Genius!
I altered the procedure to fix the IF statement so that it uses an equal sign. That eliminated the error, and the tracking numbers imported fine.
IF ( @I_vTracking_Number = '' )
BEGIN
    SELECT @O_iErrorState = 4628;
    EXEC @iStatus = taUpdateString @O_iErrorState,
                                   @oErrString,
                                   @oErrString OUTPUT,
                                   @iAddCodeErrState OUTPUT;
END;
What is baffling is that this bug exists in GP 2016, 2015, and 2013, which is where I stopped looking. I'm assuming that it also exists in versions prior to 2013.
However, I recently worked with another customer who imports tracking numbers for their SOP Orders, but they did not receive this error. Why?
Looking at their taSopTrackingNum procedure, I see that it is an internal Microsoft version of the procedure that was customized by MBS professional services for the customer. The stored procedure was based on the 2005 version from GP 9, and it does not appear to have the validation code. Because it is customized, it was just carried over with each GP upgrade, always replacing the buggy updated version that is installed with GP.
So some time between 2005 and 2013, someone monkeyed with the procedure, added error 4628, and didn't bother to test their changes. And the bug has now existed for over 4 years.
I can't possibly be the only person to have run into this. Can I? Does nobody else use this eConnect node?
Anyway, the good news is that it's easy to fix. But just remember that every time you upgrade GP, that buggy proc is going to get reinstalled, and you'll forget to update the buggy proc, and it will cause your tracking number imports to start failing.
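Since the fix is so easy to forget after an upgrade, one option is to add a small check to your post-upgrade checklist. Here is a minimal Python sketch that scans the text of the procedure (however you choose to script it out of SQL Server) for the inverted comparison. The regular expression only assumes the statement looks like the listing in this post, and the names are hypothetical.

```python
import re

# Looks for: IF ( @I_vTracking_Number <> '' ) BEGIN SELECT @O_iErrorState = 4628
# Whitespace and line breaks between tokens are allowed, as in the listing in this post.
INVERTED_CHECK = re.compile(
    r"IF\s*\(\s*@I_vTracking_Number\s*<>\s*''\s*\)\s*BEGIN\s*"
    r"SELECT\s+@O_iErrorState\s*=\s*4628",
    re.IGNORECASE,
)


def has_inverted_check(proc_text: str) -> bool:
    """Return True if the procedure text still contains the buggy <> comparison."""
    return INVERTED_CHECK.search(proc_text) is not None


# Sample text copied from the buggy listing; the fixed version swaps <> for =.
buggy = """
IF ( @I_vTracking_Number <>
'' )
BEGIN
SELECT @O_iErrorState =
4628;
"""
fixed = buggy.replace("<>", "=")

print(has_inverted_check(buggy), has_inverted_check(fixed))
```

If a future GP release reformats or rewrites the validation, this pattern won't match, so treat a negative result as a hint rather than proof that the proc is fixed.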
Carry on.