Washington DC Network Outage



Russell
05-09-2009, 07:42 AM
I understand there was a problem with the Washington DC center. Presumably this affects those on east01, and these users were moved to Dallas. I'm curious what happens if they have softphones connected to east01. Won't the softphones be dead if Washington is dead? And if Washington is not dead, as the follow-up message says, won't they cause the previously mentioned issue with registrations on two different servers (unless that has already been disabled, in which case the softphones will be dead)? Instead, can't games be played with DNS so that east01 points to Dallas? Or does that take too long to propagate? Anyway, if reconfiguring the softphones is up to the consumer, this is all the more reason to make the server name visible in the vPanel. C'mon guys, considering that you've created a new ATA Preferences menu where you can actually change values, adding a non-modifiable (read-only) label with the server name should be trivial to implement.

Xponder1
05-09-2009, 12:17 PM
This is a valid question for sure. I think for the time being I will just turn X-Lite off.

ptrowski
05-09-2009, 12:47 PM
I am wondering if there are lingering issues: outgoing calls work, but incoming calls were just going to my network-down forwarding number.

VOIPoTim
05-09-2009, 01:11 PM
I am wondering if there are lingering issues as outgoing calls work but incoming calls just were going to my network down forwarding number.

Everything's smooth on our end. Support has been basically dead, with almost nothing coming in all day.

I don't remember for sure, but I think there were always issues with Dallas for you, right?

I went ahead and moved you back to DC. Your ATA will pick up the change on the next provisioning cycle or reboot. See if that clears things up for you.

We'll be moving everyone that was originally on it back later today.

Russell
05-09-2009, 05:27 PM
We'll be moving everyone that was originally on it back later today.

Can you please post back here or in the announcements thread when this is done?

VOIPoTim
05-09-2009, 05:30 PM
Can you please post back here or in the announcements thread when this is done?

Sure, it's already done.

It was done around 3PM Central. Anyone who was on the DC node before should be back on it.

All this should be transparent to users.

VOIPoTim
05-09-2009, 05:44 PM
I wish it were that simple, but from a logistics standpoint, it's not.

We can look at adding some kind of server indication, but it's not really as simple as just throwing it up in vPanel based on past experiences.

What happens is people see East01 or Central01 or something, and some immediately assume they should be connected to another server since it may be closer to them. Then we have angry people wanting to be moved around for no reason, and we really can't use any kind of automated balancing, since users will be upset if they're moved even when there is no legitimate reason to be upset.

In general, their preference will likely be where we end up putting them anyway, but we have to have flexibility to do what we need to do to balance things out and make changes along the way.

For the time being, we've put a fix in to prevent the looping from registering on multiple servers so softphone users should be fine.

We're also in the process of rolling out a new procedure for handling softphones and grandfathered BYOD users in which they have a 2nd set of SIP credentials and are completely isolated from the other SIP servers. We want to wait until this is done before getting into a lot of info about SIP credentials.

I know for most users it wouldn't turn into a problem, but from past experience, we've learned that if we give users too much information, it will cause huge support issues (such as users being confused by a device "expires" time being listed). We don't want it to turn into a game of users wanting to be on specific servers unless they have legitimate problems.

Another similar example: right now we have some users who sniff packets to see how their audio is routed and demand that it be routed closer to them or in a different way, even if they're not having issues.

We move users around regularly as needed to balance things, but this one was unique since it was a mass move out of DC. We do have the ability to force things for certain users, but want to leave that as a support option if we feel that it'll resolve an issue instead of it being a user-initiated change.

Ultimately, we need to find the right balance with giving advanced users enough information without it turning into a support nightmare, debating which server is better on forums, etc.

Xponder1
05-09-2009, 06:13 PM
Sounds great Tim. Thanks for the updates.

Brian
05-09-2009, 07:04 PM
Tim, based on your comments, this may not even apply, but would it make sense to have a section in vPanel with softphone information that dynamically displays the server to use for the connection (possibly an alias, like server01, that points to east01, central01, etc.)? That way, it would look like this is just how to use the softphone and hopefully wouldn't cause confusion about not being on the closest server.



Russell
05-09-2009, 07:35 PM
I appreciate your long and detailed response to my post, Tim. I do see your point about a certain section of your users trying to aim for the closest server (I guess I'm guilty of that too, as I requested to be assigned to a certain server within the first few days), and how this becomes a support burden and makes things more difficult, especially in the context of dynamic load balancing.

Anyway, I'm glad that the looping issue has been addressed, as I have a vested interest in softphones at the present time. I have a child who will be spending about 2-3 months volunteering in Central America, and I was planning on installing a softphone on her laptop, though I'm not sure what kind of Internet connection she'll have. At this moment, I'm not sure I want the softphone to "register", as I don't want our calls also ringing on her laptop (if connected), so the looping issue may be irrelevant in this case.

Which reminds me ... should I open a ticket for the record, so that you guys don't freak out if you find "domestic" calls with my credentials originating in Central America over the summer? Is there some way to "notate" my account to that effect?

usa2k
05-10-2009, 05:36 AM
Perhaps it is a liability having servers with recognizable names that relate to location? Otherwise, how would they know to be angry and want to be moved? The marketing would need to be such that they assume they are kept on a close server, but leave it unverifiable so they don't generally have cause for concern.

On the flip side, customers may be with VOIPo because there is a server closer to them. It is too bad a VoIP device manufacturer couldn't add a primary/secondary profile for a service: if the primary cannot complete a call, the device auto-tries the next profile. (That could be two different services or, as with VOIPo, one service with different servers. The second profile does not register until needed.) I think Asterisk can do that, so it's not like you can patent the idea and have a hardware manufacturer provide custom firmware that earns royalties if used for other services ...

For failover conditions, how would you handle same number, multiple devices? You would need a database that maps the number to a registration status on each server. You would then know it's registered X times on server A and Y times on server B. When would the failover number kick in? When only one device fails? Or all devices? At present, I expect it's when the provisioned device fails.

What if it's BYOD? Is there no failover protection? Do you guess which device is the softphone? You could have a place to register each device (by MAC ID makes sense) and a grid of options.

Device#1, MAC ___, Failover if not registered/No Failover (Clone#1)
Device#2, MAC ___, Failover if not registered/No Failover (Clone#2)
Device#3, MAC ___, Failover if not registered/No Failover (SPhone)
Register all SERVER#1/SERVER#2
Fallback all SERVER#1/SERVER#2

If bold were used as in my example, I would use underline to show the active choice being used by VOIPo. Or maybe the underline would be visible only to a beta group that can help with feedback, and to your support team so they can better see the current settings.

(Fallback in effect)
Register all SERVER#1/SERVER#2
Fallback all SERVER#1/SERVER#2
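
A rough sketch of how the per-device grid above could be evaluated, purely as an illustration. The `Device` class, the `should_fail_over` function, and the MAC values are made up for this example and are not anything VOIPo has described. The assumption here is that failover triggers as soon as any device flagged "Failover if not registered" loses its registration, while devices marked "No Failover" (such as a softphone) are ignored.

```python
from dataclasses import dataclass


@dataclass
class Device:
    label: str                      # e.g. "Clone#1" or "SPhone" from the grid
    mac: str                        # placeholder MAC ID, not a real device
    registered: bool                # current registration status
    failover_if_unregistered: bool  # the "Failover if not registered" option


def should_fail_over(devices):
    """Return True if any device flagged for failover is not registered.

    Devices set to "No Failover" never trigger failover, whatever their status.
    """
    return any(d.failover_if_unregistered and not d.registered for d in devices)


# Hypothetical account state: one ATA clone is down, the softphone is off.
devices = [
    Device("Clone#1", "00:00:00:00:00:01", registered=False, failover_if_unregistered=True),
    Device("Clone#2", "00:00:00:00:00:02", registered=True, failover_if_unregistered=True),
    Device("SPhone", "00:00:00:00:00:03", registered=False, failover_if_unregistered=False),
]

print(should_fail_over(devices))
```

With this state the check is true because Clone#1 is flagged and unregistered. If only the softphone were unregistered, it would be false.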

dswartz
05-10-2009, 09:17 AM
The outage at the DC center was disturbing. An entire data center off the air for that long? Something there screwed the pooch :(

burris
05-10-2009, 10:10 AM
The outage at the DC center was disturbing. An entire data center off the area for that long? Something there screwed the pooch :(

I think these data centers have organized a conspiracy against Tim..;)

usa2k
05-10-2009, 12:23 PM
Maybe HostGator is big enough now, to have their own Data Center? :)

That would be something many hosting companies could not brag about!
(Going to need a bigger building!)

VOIPoTim
05-10-2009, 12:49 PM
For failover conditions, how would you handle same number, multiple devices?

Basically it should just fork the call out to all of the registrations and whichever one answers first will get it.

usa2k
05-10-2009, 07:55 PM
Basically it should just fork the call out to all of the registrations and whichever one answers first will get it.

And as long as one device is registered, no failover occurs?
I may have been thinking too far outside the box.

VOIPoTim
05-10-2009, 07:59 PM
And as long as one device is registered, no failover occurs?
I may have been thinking too far out the box.

Yeah, failover would only be if none are registered.
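
The behavior described in this exchange (fork the call to every registration, and fail over only when none are registered) can be sketched as below. This is a minimal illustration under stated assumptions, not VOIPo's actual routing code: `route_incoming_call`, the contact strings, and the forwarding number are all hypothetical.

```python
def route_incoming_call(registrations, failover_number):
    """Decide how to deliver an incoming call.

    registrations: contact addresses currently registered for the number,
                   possibly spread across several servers.
    failover_number: the "network down" forwarding number (hypothetical value).
    """
    if registrations:
        # At least one device is registered: ring them all at once;
        # whichever answers first takes the call.
        return ("fork", list(registrations))
    # No registrations at all: send the call to the forwarding number.
    return ("failover", [failover_number])


# One ATA and one softphone registered: both ring, no failover.
print(route_incoming_call(["ata@east01", "softphone@central01"], "555-0100"))

# Nothing registered: the call goes to the forwarding number.
print(route_incoming_call([], "555-0100"))
```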

Russell
05-10-2009, 08:33 PM
Yeah, failover would only be if none are registered.

I think this makes sense. After all, VOIPo is providing phone service to ONE residence. I think of any other use as a nice perk.