Tuesday, January 5, 2010

Phasing out 24-56kb MP3 streams

We're going to be phasing out our lower bitrate MP3 streams in 2010 and replacing them with aacPlus feeds. 24-32k MP3 streams will become 32k aacPlus streams, which sound so much better than the existing 32k MP3 streams. 56k MP3 streams will become 64k aacPlus streams.

Eventually, we'll offer 32k and 64k aacPlus streams for all our channels, 128k MP3 streams for compatibility, and 32kb Windows Media streams.

iTunes 9 now has full support for aacPlus (HE-AAC) streams, and iTunes was the main player that didn't support them. Since 1/3 to 1/2 of all our traffic is people listening in iTunes, this held us back from making more of a switch to aacPlus before now.

We will also be adding Flash-based streaming this year, which will work well for many people in office settings where they can't install a media player.

The listener numbers for the low-bitrate MP3s have fallen drastically over the last year, and I can't think of any reason to keep the low-bitrate MP3 streams running. If you think we should for some reason, leave a comment and let me know.

I'm hoping that we'll get more adoption of the 64kb aacPlus streams, which frankly sound as good as or better than the 128k MP3 streams.

Happy 2010!


Tuesday, September 8, 2009

Snow Leopard support for aacPlus

I just noticed that the Snow Leopard QuickTime player now plays aacPlus over HTTP via a .pls file right out of the box. If you get info while playing an aacPlus stream, it doesn't say anything special to indicate it's aacPlus: just AAC, 2 channels, 22050 Hz. But it really is playing back as a 44.1 kHz stream (remember that aacPlus synthesizes all audio over 10 kHz).

Strangely, though, RTSP streams in QuickTime are NOT playing back in aacPlus! They are played back only as AAC (and hence sound like they're 22 kHz files rather than 44.1).

To try it out, open up http://somafm.com/groovesalad48.pls from within QuickTime Player. You don't get metadata, but you do get the stream in full fidelity.

Now try the RTSP version:

rtsp://64.202.98.91:554/gs.sdp
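For anyone curious what a player actually gets when it opens a .pls link: the file is a small INI-style playlist that lists one or more stream URLs. Here is a minimal Python sketch of reading one. The sample playlist and its example.invalid URLs are made up for illustration (they are not the contents of our real .pls files), and parse_pls is just a name I picked.

```python
import configparser

# Hypothetical playlist contents, in the common PLS layout.
SAMPLE_PLS = """\
[playlist]
NumberOfEntries=2
File1=http://example.invalid:8000/stream-a
Title1=Example Stream A
Length1=-1
File2=http://example.invalid:8000/stream-b
Title2=Example Stream B
Length2=-1
Version=2
"""


def parse_pls(text):
    """Return a list of (title, url) tuples from PLS playlist text."""
    # interpolation=None so a '%' in a URL is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["playlist"]  # option names are matched case-insensitively
    count = section.getint("NumberOfEntries")
    entries = []
    for i in range(1, count + 1):
        title = section.get(f"Title{i}", "")
        url = section[f"File{i}"]
        entries.append((title, url))
    return entries


for title, url in parse_pls(SAMPLE_PLS):
    print(title, "->", url)
```

A player typically tries the first URL and falls back to the next entries if it can't connect.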

Also, it seems that the new QuickTime X doesn't support QTL files anymore. (This breaks all the QuickTime links on the SomaFM site, but we can change them to .mov files.)

PS- Rumor is tomorrow's announcement of iTunes 9 will include aacPlus playback. That would indeed be exciting if that were the case!


Tuesday, September 23, 2008

Speaking on Panel at AES show in San Francisco 10/5/08

I'll be speaking at The Audio Engineering Society show in San Francisco, Sunday October 5, 2008; 9am - 10:45am. Yes, they cruelly scheduled me for a Sunday morning time slot!

The panel is titled, "Internet Streaming - Audio Quality, Measurement, & Monitoring". I'll mostly be talking about audio quality issues and a bit about monitoring (SomaFM developed a bunch of in-house tools to monitor our streams which work pretty well).

The official description: Streaming has become a provider of audio and video content to the public. Now that the public has recognized the medium, the provider needs to deliver the content with a quality comparable to other mediums.

The Moderator is David Bialik. Panelists include Geir Skaaden, Neural Audio; Skip Pizzi, Radio World; Ray Archie, CBS Radio; Rusty Hodge, SomaFM; and Benjamin Larson, Streambox Inc.


Tuesday, September 2, 2008

Will Comcast's streaming caps impact SomaFM listeners?

I've gotten a lot of questions lately about Comcast's streaming caps and how they might affect listening to SomaFM.

Comcast's cap averages out to about 770 Kbps continuous average bandwidth usage, or about half the capacity of a T1 line. Or about 6 times the bandwidth required to listen to SomaFM. So you could listen to SomaFM 24 hours a day, 7 days a week and use only about 1/6th of the bandwidth you're allowed to use under the new Comcast rules.
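If you want to check that ratio yourself, here is a rough sketch of the arithmetic in Python. It assumes a 128 kbps stream, a 30-day month and decimal gigabytes; the 770 Kbps average is the figure quoted above, and the function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_gigabytes(kbps, days=30):
    """Data moved by a constant stream of `kbps` kilobits/s over `days` days, in decimal GB."""
    bits = kbps * 1000 * SECONDS_PER_DAY * days
    return bits / 8 / 1e9


cap_average_kbps = 770   # continuous average allowed by the cap (figure from this post)
stream_kbps = 128        # assumption: nonstop listening to a 128 kbps stream

cap_gb = monthly_gigabytes(cap_average_kbps)
stream_gb = monthly_gigabytes(stream_kbps)

print(f"Cap-equivalent usage: {cap_gb:.1f} GB per 30 days")
print(f"Nonstop 128k listening: {stream_gb:.1f} GB per 30 days")
print(f"Cap is {cap_gb / stream_gb:.1f}x the stream's usage")
```

Under those assumptions, round-the-clock listening works out to roughly 41 GB a month against an allowance of about 250 GB, which is where the "about 1/6th" comes from.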

Comcast's limits won't affect most users. The main people who will be affected are those who download and share lots of files. Even people who use lots of streaming video likely won't be affected by these limits.

So as far as listening to SomaFM goes, the limits being imposed by Comcast shouldn't affect you.


Friday, August 8, 2008

SXSW Panel Idea proposals

I have proposed a couple of panels for SXSW:

Making Music Sound Better Online: Improving Flow and Presentation: Most music services present music like a jukebox, not a professional DJ. Songs stacked serially, not flowing together for various reasons: tonal balance, loudness levels, speed and intensity. We discuss improving that presentation: automated mixing and segue tools; "harmonic key mixing" tracks; improving sound quality of MP3s and alternative Codecs; audio processing systems keeping subjective loudness and tone consistent.

One thing I want to discuss on this panel is broadcast audio processing, and the FM "loudness wars", and why "loudness" doesn't really matter for internet audio but why consistent audio levels are really important. (It's one of my pet peeves, and there are a lot of big services that don't get that right now.)

Rewriting the DMCA: How to Improve Section 114: This panel will discuss the ugly bits of the Section 114 compulsory license for digital/internet music usage, and what parts are in it for historic reasons that don't apply in today's world, as well as changes that both users of the licenses (webcasters) and content providers (artists, labels) would agree to.

Many people agree that certain aspects of Section 114 are obsolete. Others think the DMCA should provide more protection and compensation for creators. Some want to simplify the rules, and others think it doesn't go far enough.

Imagine we could re-write Section 114 today, knowing what we now know. How would it be different? What would be the same?


Monday, July 28, 2008

San Francisco Music Tech Conference Video

I moderated a panel at the San Francisco Music Tech conference on a bunch of different streaming technology issues.

The plan was for this to be a discussion panel about where streaming technologies are going, what can be taken advantage of now, and what's coming down the pipe soon. We also talked about what is really needed, versus the "solutions" that the market is pushing right now.

You can't really see me in this video, only my hands in the left side of the frame!

Left to right:

John Richey - Wireless Music Delivery Expert, Apple (he's half out of the frame, sorry).

Greg Ogonowski - VP of New Product Development, Orban

Tim Pozar - CEO, Late Night Software and former VP of Engineering, UnitedLayer

Chris Grigg - Head of Standards, Beatnik


Thursday, July 10, 2008

iPhone streams updated for 2.0/3G

Until today, our iPhone/iPod Touch streams were only working on the current iPhones with the 1.x software. Now thanks to some testing by Mark Malone at Apple, we've updated our iPhone streams to work with the 2.0 software and the 3G iPhones coming out on Friday. So now our streams work on both old and new iPhones and iPod touches. While I haven't had a chance to test the 3G data network with a new iPhone, you should be able to use the WiFi streams when you're on the ATT 3G network. I'll be interested to see how it works out!


Friday, June 20, 2008

We rolled out iPhone streaming today!

After a lot of testing, we rolled out iPhone streaming tonight. I'm still not completely happy with the look of our iPhone mini-site so you might see some changes in the near future, but rather than wait until everything was perfect, I decided to release it now.

So now when you go to somafm.com on your iPhone, you get an iPhone-specific site with links for both EDGE (32-56k) and WiFi (128k) streams.


Thursday, June 19, 2008

Infrastructure Upgrades

We've been improving our streaming and web infrastructure for the last couple of weeks. Not everything has launched yet, because we want to fully test it for a couple more weeks (for example, the web site is still running on the old server). We're also installing a backup web server on the East Coast at the facilities of Steadyhost (where we have some streaming servers now). We've been happy with the service provided by Steadyhost; they also provide hosting for some other large internet radio stations such as DI.FM.


Friday, June 6, 2008

Continued problems with our hosting provider; email and web move to our San Francisco datacenter under way

Regrettably, we're still running services from ThePlanet.com's web hosting facilities until we can migrate everything to 365 Main in San Francisco.

So this morning, I wake up to find that our mail server is unreachable again, and this series of messages on The Planet's service update site:

  • June 6 – 10:00am CDT - We have lost network connectivity to H1. We are confirming the extent of any power loss, and we will be updating shortly.
  • June 6 – 10:05am CDT - Transport for H1 temporarily fell offline and is restored. H1 Phase 2 did not lose power. H1 Phase 1 lost power. We will be updating again shortly.
  • June 6 – 10:10am CDT - The temporary generator powering Phase 1 failed. We switched over to the backup generators that were just brought in. The CRAC units have been powered on, and PDUs are having power restored right now. [This is the second temporary generator that has failed in the last week at The Planet. Perhaps it is operator error? - Rusty]
  • June 6 – 10:15am CDT - We continue to power PDUs in Phase 1. We will update when all PDUs have been restored.
  • June 6 – 10:20am CDT - Power has been restored completely to Phase 1. Our DC Ops team will be walking through the aisles to confirm all racks are online.
  • Customer Support Overview (June 6, 11:30am CDT): -Technical Support Phone: No Hold Time

From our monitoring, the service went down at 7:40 AM pacific, or 9:40 AM CDT. They were a little slow to notice they lost communications with their data center!

Of course our mail server is still unreachable at 11:44 Pacific, or 1:44 PM CDT, 4 hours after they stated that power has returned.

What really annoys me is that they are stating on their site, "Technical Support Phone: No Hold Time". The reason for this is that they're sending all support calls to their sales people, who don't do much more than tell you they're going to escalate you to Level 2 support, but all those techs are busy and they'll need to call you back. I suspect they did that to reduce their 800 number call expenses, because they had hundreds of customers sitting on hold for 30-40 minutes all the time. They also get to make it look like their response time is much better than it really is.

After waiting 45 minutes for a callback that never came, I called in again. Finally I got them to connect me with a real tech support service, not just the person logging callbacks. I've been on hold with "real" technical support at The Planet for 10 minutes now, trying to get our mail server powered back on.

10 more minutes on hold, and the tech tells me, "Can you go online and submit a reboot request ticket, that will expedite things."

At this point I have no idea when our mail will be back again.

I'll continue to move our services out of The Planet and to our own servers in San Francisco. Our DNS is already moved (although we still don't have the redundant location DNS in place yet); the hardware for the new mail server is set up, but the mail services aren't configured yet. There are also a few issues with some of the web services we run: the old systems at The Planet used a much older version of the Berkeley DB software package, which isn't compatible with the current versions. So I have to make a few changes to our "now playing" code as well as our stream server monitoring systems. The "now playing" database is the most important to our listeners, because that's got all the information on which album songs come from, as well as the info on where to buy the track or get more info on the artist.

The mail server is a bit harder to migrate, but I'm working on that right now as well.

Hopefully, the good thing that will eventually come out of this is that we'll have redundant servers in different geographic locations.


Tuesday, June 3, 2008

Tough Weekend Outage

The company that hosts the webserver for SomaFM.com and the mail server, ThePlanet.com, had a rather large outage last weekend, which took the SomaFM web site off the air (so to speak) from 3:08 PM Pacific time on Saturday (May 31st) until about 3:37 AM Pacific time Monday morning (June 2nd).

Our mail server is still down, about 72 hours later. More on that in a bit.

The cause of this outage was not immediately known, and calls to The Planet's tech support lines (which had 30 minute waits) were "unrewarding" to say the least. At first they wouldn't give me any information at all (because I didn't have the proper password), and they were only giving out information to "affected customers". I pointed out that since they had caller ID and they knew that I was calling from the phone number on record for our account, that should prove adequate to allow them to give me some information on what was happening. The rep finally agreed, even though he said he "could get in trouble for telling me this".

What he told me was that they had had a transformer explosion at the datacenter where our servers were located.

This seemed kind of fishy. Didn't they have adequate generator power? What about the UPSes? Blown transformers happen fairly frequently; that's one reason you have redundant power systems.

A while later, they made a public announcement about the outage at the Planet's Houston data center:

Today at approximately 5:45 p.m. [central time], a transformer in our H1 data center in Houston caught fire, thus requiring us to take down all generators as instructed by the fire department. All servers are down until power can be restored.

According to our monitoring logs, it was 5:07 PM central time, not 5:45 PM.

We received more information dated May 31 – 10:46pm (8:46 pm Pacific):

On Saturday, May 31st at 4:55pm CDT in our H1 data center, electrical gear shorted, creating an explosion and fire that knocked down three walls surrounding our electrical equipment room. Thankfully, no one was injured. In addition, no customer servers were damaged or lost.

We have just been allowed into the building to physically inspect the damage. Early indications are that the short was in a high-volume wire conduit. We were not allowed to activate our backup generator plan based on instructions from the fire department.

This time makes more sense. Seems like the UPSes did indeed work, but they weren't able to switch over to generator power. So about 10 minutes after they lost power, the UPS batteries were expended, and the facility lost power.
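For the curious, here is how those timestamps line up, as a quick Python sketch. The clock times come from The Planet's updates and our monitoring logs as quoted in this post; the fixed UTC offsets (CDT as UTC-5, PDT as UTC-7) and the variable names are my own shorthand.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are enough here, since daylight time was in effect throughout.
CDT = timezone(timedelta(hours=-5), "CDT")
PDT = timezone(timedelta(hours=-7), "PDT")

explosion = datetime(2008, 5, 31, 16, 55, tzinfo=CDT)    # The Planet's stated time
servers_lost = datetime(2008, 5, 31, 17, 7, tzinfo=CDT)  # from our monitoring logs
web_back = datetime(2008, 6, 2, 3, 37, tzinfo=PDT)       # web server reachable again

ride_through = servers_lost - explosion   # how long the UPSes carried the load
web_outage = web_back - servers_lost      # how long the web site was unreachable

print("UPS ride-through:", ride_through)
print("Servers lost (Pacific):", servers_lost.astimezone(PDT).strftime("%I:%M %p"))
print("Web outage:", web_outage)
```

That puts the UPS ride-through at 12 minutes, which fits the "about 10 minutes" estimate, and the web outage at a bit over 36 hours.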

This is also the first time they mention "the short". At first it was just a transformer fire. But now it sounds like it was a transformer explosion caused by an electrical short, which implies that some wires were so overloaded that the insulation melted and caused them to short out.

There have been lots of discussions about the blame for the problems at The Planet. I'm not going to go into that now. However, I am less than satisfied with the quality of the communications from them, and not at all happy with how they've handled the situation.

The SomaFM.com web server eventually came back while we were just finishing up restoring our backups to a new web server. (So at least we now have a tested plan and sequence for restoring from backups!)

However, as of 10:30am on June 3rd, our mail server is still not running, nor did it come back up when The Planet said that they had powered back on the part of the datacenter where it is located. After sitting on hold (with very bad music) for 35 minutes, a tech told me that our mail server machine was one of the older ones that would have to be powered on by hand... and that there were over 1000 of these machines that they would be going around and turning on one at a time. But that never happened.

The last update on The Planet's web site was kind of ominous:

This morning at approximately 2:45 a.m. CST, the temporary generator supplying power to the servers and environmental control systems located in Phase 1 of our H1 facility shut down. This was caused by some faulty current sensors in the output breaker. The sensors detected an out of balance current condition that did not exist.

At this point, I don't know when the mail servers will be working again. I guess we have to deploy a new mail server (which is also the secondary DNS server).

Wait! Another update:

Fixing the faulty breaker on the generator powering H1 Phase 1 was not successful. We have located a second generator that is currently being delivered to the facility. It is expected to arrive this afternoon and we will provide additional information regarding the new generator at that time.

That doesn't sound promising. And for all I know, our server has been blown up by a power glitch or something. Time to get working on that new mailserver, I guess!

Unfortunately, I screwed up and didn't properly back up the mail server configs and will have to recreate all that by hand, so it's not a real simple process.

But I guess it won't take too long as I won't have any interruptions from email today!

But now we do have a full backup of the SomaFM web server up and running at our rack in 365 Main's San Francisco data center. And I'm working on getting further redundancy in place so this won't impact our listeners much if it happens again.

You can follow the drama of The Planet on their Service Update web page.

And thanks for your patience with us.


Tuesday, May 6, 2008

SanFran MusicTech Summit

I'll be moderating a panel on new developments in streaming at the SanFran MusicTech Summit this Thursday, May 8th at the Hotel Kabuki. Our panel will start at 1:50pm in the Osaka Room (the downstairs room behind the Spring Room). With me will be:

John Richey - Wireless Music Delivery Expert, Apple
Greg Ogonowski - VP of New Product Development, Orban
Chris Grigg - Head of Standards, Beatnik
Tim Pozar - VP of Engineering, UnitedLayer

We're going to be talking about delivery methods. New codecs. Streaming to mobile devices. Internet radio hardware devices. How to determine if you really need a content delivery network. It should be real fun.

Here's a blurb about the summit:

The SanFran MusicTech Summit will bring together digital thought leaders from the San Francisco Bay Area, as well as from all around the country to the region which currently leads the way in innovating (both socially, and technologically) new ways of interacting with both music, and musicians. We will be working long term to help enable a sustainable, ongoing, Northern California based music and related technology market.



Tuesday, April 15, 2008

EVDO, Wireless Performance, Radio Remote broadcasts and violating your terms of service

As I sit here in my Las Vegas Motel Room (the Best Western Mardi Gras, selected only on the basis of price and proximity to the Las Vegas Convention Center, where NAB is taking place) I'm thinking about how bad wireless performance is in general.

Right now, I'm typing over EVDO, because the hotel internet - powered by Lodgenet's StayOnline - is completely dysfunctional ("timeout connecting to network"). This is the same StayOnline that gave us so much trouble at the Marriott in Austin when we were trying to cover SXSW. I guess I should learn never to depend on the in-room wireless internet at most hotels/motels, because if the hotel is busy at all - the network will be unusable.

But, we have an EVDO card! We bring our own bandwidth with us rather than rely on the hotel internet, because that way we can always have internet access over "Sprint's EVDO Rev. A networks with data speeds up to 3.1 Mbps!" Only it doesn't work that way. In fact, these days, we're lucky if we get 500kb down. Here's what I get from the Speakeasy Speed Test:

[Image: Speakeasy Speed Test screenshot]

Not too impressive.

You see, the problem happens when there are too many EVDO users. And for Sprint (like Verizon), that means all the people that have their multimedia phones. And there are more and more of those out there all the time, fighting for the finite amount of bandwidth at each cell site.

We really saw this last weekend, when we webcast from Yuri's Night. At first, the webcast worked great. We had plenty of speed. But as all the geeks started arriving for the big party that evening, the network started getting slower and slower. By 8pm, that stream (from Stage 2) rebuffered so much it was pretty much unusable.

We were doing the main stage broadcasts from WayneCo's Bus, which is equipped with a Motosat satellite internet uplink. That usually only gets about 256kb max for uplinks, unless you pay a hefty additional fee (which uses multiple transponders). So we didn't have enough bandwidth to stream both stages from the bus, and had to resort to EVDO for the other stage.

[Image: WayneCo's bus]

WiFi was also out of the question. With 5 SSIDs visible, the only reliable one was the backhaul network for the ticket booths, and it was not connected to the internet. The public internet was so overloaded that it often disappeared for minutes at a time. And only once were we able to maintain a connection, and that was before the event started. So we couldn't use WiFi between our encoding gear at Stage 2 and the bus.

Everyone is always making promises of the happy wonderful infinite bandwidth wireless future. But it's still a way off. In a crowded situation, WiFi is about as useful as a CB radio: OK for really short distances, but for useful distances (200 feet or more) it falls apart. In this case, we could barely get 40 feet out of the WiFi base station in the bus to a remote machine. Sprint's EVDO works great sometimes (at 4am in places where there aren't many users, for example), but lately in many of the places we've used it, the service is oversubscribed and slow, slow, slow.

Verizon's EVDO works just as badly as Sprint's.

Wayne de Geere, who graciously provided his bus as our base of operations, has a Verizon EVDO card, which worked about as poorly as the Sprint one. We chose the Sprint service because Verizon's Terms of Service actually prohibit streaming audio and/or video, updating webcams, and pretty much anything actually useful you'd do with their service. Sprint's restrictions are pretty much limited to things that violate the law.

Bottom line: we should have brought lots of wire. And run 600 ohm balanced audio from the stages back to where we were. Or installed wired ethernet connections to each stage. Or used something like a Marti SRPT 30 analog remote pickup unit, the technology that terrestrial radio broadcasters have been using for 30+ years.

[Image: Marti SRPT 30 remote pickup unit]

Low tech, old fashioned, but tried and true.


Friday, March 28, 2008

Multicasting from the archives

Me, 5-Mar-99: "Multicasting. It's close. And it is going to revolutionize internet radio."

Man, I got that one wrong! Multicasting never caught on, not because the technology was bad but because there were never any business reasons for ISPs to enable multicast support in their backbone routers. Supporting it would ultimately cost them money, and there was no way for them to effectively get paid for carrying the multicast traffic on their networks.

There was (and still is) no settlement model, hence no business incentive for multicasting. That's the main reason it never took off.


Monday, March 10, 2008

Remote Broadcasting Lessons Learned

I thought it would be great for SomaFM to do a lot of live coverage from SXSW this year. After all, it's one of the biggest music and media festivals in the world. So I had some grand plans, many of which have failed so far.

Plan number one:

Live webcams looking at the Sixth Street club area, as well as looking at the convention center and Brush Square Park. We have the cameras; we secured places to locate them in the Courtyard hotel next to the convention center. Alas, we didn't expect a total failure of the hotel's ethernet network system. And our backup plan didn't expect the hotel's wireless system to become overloaded and crash multiple times a day.

In fact it's a good thing that I have a Sprint EVDO card to get wide area wireless access to the internet, or I'd have no internet connectivity at all. But even Sprint's EVDO network is getting overloaded during peak hours (e.g. 10am-midnight local time). Today, it took about 30 minutes to upload 30 photos, when it should have taken 2-3 minutes.

Plan number two:

Live "Austin Audio" from above Sixth Street. The street sounds here in Austin have to be heard to be believed. And I thought it would be really cool to do a live broadcast of the sound of Austin from the hotel a block off Sixth Street. So I brought a couple of portable streaming encoders with us, and some stereo microphones to mount on the balcony outside the hotel room. Except that there were no rooms at the hotel with a balcony, and worse yet, no opening windows in the room! So scratch that plan.

Plan number three:

Quickly edit the podcasts, interviews, and band recordings each night on the laptop and upload before morning. The problem here is that we normally use ProTools for doing all our editing and audio production, but since ProTools won't work on OSX 10.5, I couldn't run it on my laptop. (SomaFM has one dedicated "audio production" machine that runs ProTools and also has the master music library on it; but this machine has to run OSX 10.4.x for ProTools.) I ended up installing a copy of Logic, but since I haven't used it much, there has been a learning curve that I hoped would have been faster. Also, some of the audio processing plugins we use with ProTools weren't licensed for use on a laptop.

Plan number four:

Broadcast the Bay Area Takeover party on Thursday live. OK, this one might still happen, but given the way things are going, I'm not expecting it to work. We are still going to try.

I used to scoff at the NPR guys when they'd send a crew of 5 people to SXSW to report on it. There are 3 people from SomaFM here, but two of those people (Merin and Elise) are also here on behalf of their day jobs and have to give 2/3rds of their time to that. What we needed was an audio production person to do the first passes at editing all the material we're recording. Maybe next time we'll bring an intern. :-)

In retrospect, here's what I think would have made things work a lot better:

Get a couple EVDO to WiFi+Ethernet routers, so we don't have to rely on internet from the hotels or venues at all. And bring plenty of ethernet cabling, as you may not be able to rely on the wireless networks.

Bring at least one extra laptop for processing pictures and audio.

Bring extra batteries for the digital recorders!

Arrive a couple days early to test everything... arriving 24 hours before the event starts won't give you enough time to get everything ready.

Have one person whose job is just to provide production support, who doesn't go to the panels and conference itself: someone whose only job is to get the stuff edited and posted.

I wonder how this list will change in the next few days... we're basically winging it at this point.


Thursday, October 18, 2007

Internet/Telecom in Iceland

I can't make a good judgement call about the net in Iceland, because I have a weak WiFi signal at the hotel, so I'm not sure how much of the slow net is due to the WiFi, the connection of the Hotel, or the connection of the continent.

I did notice that .is sites come up faster than US .com sites... and my photo upload speeds to Flickr seem to be in the 10-15k range a second at best.

However, the internet is everywhere here. Tons of cafés have free WiFi, and every café has people working on laptops in them.

Until I can test speeds at a few more places, I can't really attest to the speed of the net here.

The GSM network supports EDGE, but because of the high data roaming charges, I haven't tested it. The cell coverage is pretty impressive; it has worked everywhere I've been since getting here.


Tuesday, July 24, 2007

San Francisco power outage, SomaFM outage

There was a power outage affecting downtown San Francisco today, which also caused an outage at SomaFM's primary datacenter, 365 Main. Note that we've been there for about 2 years now, and this is the first power outage that's affected us. They had another outage right before we moved in, due to a faulty fire alarm which cut power to most of the building.

Now, a "world class datacenter" is supposed to have all sorts of redundant systems in place. And they did. But a slightly unusual series of events proved that even with all that redundancy, things can go very wrong. Here's what really went down at 365 Main as far as I can tell:

365 Main, like most facilities built by Above.net back in the day, doesn't have a battery backup UPS. Instead, they have a "CPS", or continuous power system: very, very large flywheels that sit between electric motors and generators. So the power from PG&E never directly touches 365 Main. PG&E power drives the motors, which turn the flywheels, which then turn the generators (or alternators, I don't remember the exact details), which in turn power the facility. There are 10 of these on their roof (or as they call it, the mezzanine; it's basically a covered roof). These CPS units isolate the facility from power surges, brownouts and blackouts.

The flywheels (the CPS system) can run the generator at full load for up to 60 seconds according to the specs.

There are also 10 large diesel engines up on the roof, connected to these CPS units. If the power is out for more than 15 seconds (as I recall, I could be wrong on the exact time), the diesels start up, clutch in, and drive the flywheels.

There is a large fuel storage tank in the basement, and the fuel is pumped up to the roof. There are smaller fuel tanks on the roof as well, with enough capacity to run all the generators until the fuel starts getting pumped up to the roof.

Here's what I suspect happened:

It was reported there were several brief outages in a row before the power went out for good, so I bet the CPS (flywheel) systems weren't fully back up to speed when the next sequential outage occurred. Since several of these grid power interruptions happened in a row, and were shorter than the time required to trigger generator startup, the generators were not automatically started, BUT the CPS didn't have time to get back up to full capacity. By the 6th power glitch, there wasn't enough energy stored in the flywheels to keep the system going long enough for the diesel generators to start up and come to speed before switching over.
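To make that theory concrete, here is a toy Python model of a flywheel being drained by back-to-back glitches. It's only a sketch of my guess, not how the real CPS controls work: the 60 second capacity and 15 second trigger are the figures mentioned above, while the diesel spin-up time, the recharge rate and the glitch lengths are numbers I made up for illustration.

```python
FLYWHEEL_CAPACITY_S = 60.0   # seconds of full-load ride-through (per the specs)
GENERATOR_TRIGGER_S = 15.0   # outage length before the diesels start (as I recall)
GENERATOR_SPINUP_S = 10.0    # hypothetical: time for a diesel to start and clutch in
RECHARGE_RATE = 0.2          # hypothetical: stored seconds regained per second of grid power


def simulate(events, stored=FLYWHEEL_CAPACITY_S):
    """Walk through (outage_seconds, grid_seconds_afterwards) pairs.

    Returns (load_dropped, stored_seconds_left).
    """
    for outage_s, grid_s in events:
        # The flywheel carries the load for the whole outage, or until the
        # diesels have been triggered and spun up, whichever comes first.
        needed = min(outage_s, GENERATOR_TRIGGER_S + GENERATOR_SPINUP_S)
        if stored < needed:
            return True, 0.0
        stored -= needed
        # Once power is back, the flywheel slowly spins up again, never past capacity.
        stored = min(FLYWHEEL_CAPACITY_S, stored + grid_s * RECHARGE_RATE)
    return False, stored


one_long_outage = [(300, 0)]
five_glitches_then_outage = [(12, 5)] * 5 + [(300, 0)]

print("Single long outage:", simulate(one_long_outage))
print("Five short glitches, then a long outage:", simulate(five_glitches_then_outage))
```

With these made-up numbers, a single long outage is handled with energy to spare, but five 12-second glitches (each too short to start the diesels) leave the flywheel nearly empty, so the sixth outage drops the load before the generators can take over.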

Why they didn't just manually switch on the generators at that point is beyond me. (I bet they will next time!)

So they had a brief power outage. By our logs, it looks like it was at the most 2 minutes, but probably closer to 20 seconds or so.

So it looks like the diesels did cut over, but not before the CPS was exhausted in some cases. The whole facility did not lose power I'm told, just most of it.

Here's the letter their NOC sent to customers about this:

This afternoon a power outage in San Francisco affected the 365 Main St. data center. In the process of 6 cascading outages, one of the outages was not protected and reset systems in many of the colo facilities of that building.

This resulted in the following:

- Some of our routers were momentarily down, causing network issues. These were resolved within minutes. Network issues would have been noticed in our San Francisco, San Jose, and Oakland facilities.

- DNS servers lost power and did not properly come back up. This has been resolved after about an hour of downtime and may have caused issues for many GNi customers that would appear as network issues

- Blades in the BC environment were reset as a result of the power loss. While all boxes seem to be back up we are investigating issues as they come in

- One of our SAN systems may have been affected. This is being checked on right now

If you have been experiencing network or DNS issues, please test your connections again. Note that blades in the DVB environment were not affected.

We apologize for this inconvenience. Once the current issues at hand are resolved, we will be investigating why the redundancy in our colocation power did not work as it should have, and we will be producing a postmortem report.

Lots of companies were affected. There was a huge line to get into the data center. It was definitely the most people I've ever seen there!
