Twitter hopes to improve the reliability of its service by moving into its own, custom-built data center later this year, the company said on Wednesday.
The announcement comes a day after Twitter suffered yet another outage that prevented users from logging in or posting updates to its service. After a fairly good start to the year, Twitter has suffered several outages since June, partly due to traffic spikes during the World Cup but also for other reasons.
Moving into its own data center probably won’t solve all of Twitter’s problems (the issue on Tuesday had to do with a database glitch), but it should help, especially with those related to its fast-growing user base. The company says it has signed up an average of more than 300,000 new users a day this year.
“Keeping pace with these users and their Twitter activity presents some unique and complex engineering challenges,” Twitter said in a blog post Wednesday. “Having dedicated data centers will give us more capacity to accommodate this growth in users and activity on Twitter.”
Running its own facility will also give Twitter more control over how its networks and systems are configured, and allow the company to make adjustments more quickly as its infrastructure needs change.
Web giants like Yahoo, Facebook and Google already have their own data centers, but many smaller companies work with hosting providers that manage their IT equipment for them. Twitter’s provider is NTT America.
The new data center, the first that Twitter will manage itself, will be in the Salt Lake City area. It will give the company “a much larger footprint in a building designed specifically around our unique power and cooling needs.” It will house “a mixed-vendor environment for servers running open source OS and applications,” Twitter said.
It’s unlikely that having its own data center would have prevented Monday’s problems, however. They occurred when the company’s user database, which stores records for more than 125 million users, “got hung up running a long running query.”
Twitter forced a restart of the database, which took 12 hours and still didn’t solve all the problems. So it replaced the database with another copy that was working properly. “In the parlance of database admins everywhere, we promoted a slave to master,” the company said.
“We have taken steps to ensure we can more quickly detect and respond to similar issues in the future. For example, we are prepared to more quickly promote a slave database to a master database, and we put additional monitoring in place to catch errant queries like the one that caused Monday’s incidents,” it said.
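Twitter did not publish details of the new monitoring or of its promotion procedure, so the following is only a rough sketch of the two ideas it described: flagging queries that have run longer than a threshold, and promoting a healthy slave copy to master. Every name here (RunningQuery, Database, find_errant_queries, promote_slave, the 60-second limit) is hypothetical, and a real system would read running times from the database's own process list rather than from an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningQuery:
    """Hypothetical snapshot of one query currently executing."""
    query_id: int
    sql: str
    seconds_running: float


@dataclass
class Database:
    """Hypothetical record for one copy of the user database."""
    name: str
    role: str  # "master", "slave", or "retired"
    healthy: bool = True


def find_errant_queries(queries: List[RunningQuery],
                        max_seconds: float) -> List[RunningQuery]:
    """Return the queries that have been running longer than max_seconds."""
    return [q for q in queries if q.seconds_running > max_seconds]


def promote_slave(cluster: List[Database]) -> Optional[Database]:
    """Retire the current master and promote the first healthy slave.

    Returns the new master, or None (leaving the cluster unchanged)
    if no healthy slave is available.
    """
    candidate = next(
        (db for db in cluster if db.role == "slave" and db.healthy), None
    )
    if candidate is None:
        return None
    for db in cluster:
        if db.role == "master":
            db.role = "retired"
    candidate.role = "master"
    return candidate


if __name__ == "__main__":
    # Illustrative data only: one quick query and one that has run 15 minutes.
    snapshot = [
        RunningQuery(1, "SELECT 1", 0.2),
        RunningQuery(2, "SELECT * FROM users", 900.0),
    ]
    for q in find_errant_queries(snapshot, max_seconds=60.0):
        print(f"query {q.query_id} has run for {q.seconds_running:.0f}s")

    cluster = [
        Database("users-a", "master", healthy=False),
        Database("users-b", "slave"),
    ]
    new_master = promote_slave(cluster)
    print("new master:", new_master.name if new_master else "none available")
```

The sketch keeps detection and promotion as separate steps, which matches the two measures Twitter listed; whether the company automates the promotion or leaves it to an operator is not stated.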