On Wednesday, January 30, 2013, OLC attended the Prince Building Tech Talks Meetup, hosted by ZocDoc and featuring Ryan Nitz of 10gen and Ana Hevesi of Nodejitsu.
Nitz started as a consultant on a browser-based IDE for PaaS, and he soon noticed how rapid development was at Mongo. He said that 10gen raised $1.5 million from Union Square Ventures and that Albert Wenger joined the 10gen board. The datacenter went online, and people used the production PaaS until they needed drastic expansion and added dev support for PaaS.
They ran into trouble very quickly: Google released Google App Engine in April of 2008. "10gen bit off more than they can handle," Nitz said. "They were working on a custom database and custom apps at the same time, so 10gen pivoted and focused just on MongoDB. They shut down PaaS and stopped development on Babble app server."
In February 2009, Nitz started to play with MongoDB and submitted bugs and feature requests to 10gen. The features he asked for included the ability to tune cursor size and the number of documents returned or queried, a binary data type in the shell, an MD5 data type and JMX in the Java driver. In the early days, MongoDB suffered from a global lock, "which was a huge problem," sharding wasn't available and AWS did not have guaranteed IOPS. "There was no journaling and it was similar to MyISAM storage engine in this respect, thus data corruption and fsck-style repairs necessary on a hard kill or crash. This was in particular when not using replication," Nitz said. He also said he didn't know about locking the database for backups.
Regarding vertical scaling, the system scaled up to 5,000 updates per second, but more capacity was needed. "All vertical ceilings reached max AWS instance type," Nitz said. "Sharding was supposed to be released by the end of 2009, but we needed it as soon as possible." Several fixes were eventually applied: some data was offloaded to S3, updates for documents were batched in Cor, some data was offloaded to Solr, and the team moved away from skip/limit, moved away from count to recurring and instated batch jobs that slowly worked the cursor. On the database side, "we decreased fsync frequency, bound db to specific CPU and allowed for frequent manual compaction," he said.
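One common way to move away from skip/limit is to page by the last key seen rather than by offset, so each batch starts where the previous one ended. The sketch below is not Nitz's code. It is a minimal illustration under stated assumptions: the helper names (`iterate_in_batches`, `make_in_memory_fetcher`) are hypothetical, and an in-memory list stands in for the database. Against MongoDB, the fetch step would roughly correspond to a query on `{"_id": {"$gt": last_id}}` sorted by `_id` with a limit.

```python
import time
from typing import Callable, Iterator, List, Optional

Doc = dict


def iterate_in_batches(
    fetch_after: Callable[[Optional[int], int], List[Doc]],
    batch_size: int = 100,
    pause_seconds: float = 0.0,
) -> Iterator[List[Doc]]:
    """Yield documents in small batches, resuming from the last _id seen.

    fetch_after(last_id, limit) is a hypothetical callable that returns up to
    `limit` documents with _id greater than last_id, ordered by _id.
    """
    last_id = None
    while True:
        batch = fetch_after(last_id, batch_size)
        if not batch:
            return
        yield batch
        last_id = batch[-1]["_id"]
        if pause_seconds > 0:
            # Optional pause so a background job works through the data slowly.
            time.sleep(pause_seconds)


def make_in_memory_fetcher(docs: List[Doc]) -> Callable[[Optional[int], int], List[Doc]]:
    """Stand-in for a database query; an assumption made so the sketch runs alone."""
    ordered = sorted(docs, key=lambda d: d["_id"])

    def fetch_after(last_id: Optional[int], limit: int) -> List[Doc]:
        matching = [d for d in ordered if last_id is None or d["_id"] > last_id]
        return matching[:limit]

    return fetch_after


if __name__ == "__main__":
    sample = [{"_id": i} for i in range(1, 8)]
    for batch in iterate_in_batches(make_in_memory_fetcher(sample), batch_size=3):
        print([d["_id"] for d in batch])
```

Unlike an offset, which the server has to walk past on every page, the last-seen key lets each batch begin directly at its starting point when the key is indexed.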
The sharding alpha version of the database was released buggy. "It didn't evenly distribute data properly," Nitz said. "I used a bad shard key, which made it all random and we had to wait for further stability, but in the end, someone else finally took over the development of the system [because his contract ran out]." Nitz started consulting at 10gen in December of 2010. He designed and implemented MongoDB Monitoring Service (MMS) and launched a beta in about four weeks. "The original MMS agent was in Python," he said. "PyMongo was fairly immature at the time and the replica set had bugs in it and memory leaked, but now PyMongo is in an excellent state."
"I learned a lot of lessons from this," Nitz said. He emphasized application, infrastructure and database instrumentation as essential. "Code changes over time," he said regarding application instrumentation. "That change will impact the database. Same thing for load/usage changes." On infrastructure instrumentation, "Cloud providers have issues, external networks have issues and hardware fails," he said.
"Tune write concerns carefully based on operation," Nitz said. "App/system functionality must be degradable to handle infrastructure issues." He added that schema design is critical. "Never stop looking for improvements. MMS database RRD schema has gone through much iteration. Refactor code when new MongoDB functionality is released," he said. "Schedule batch jobs wisely, tune cursor sizes on batch jobs to minimize impact on a system. System alerts are essential too. Put proper signal-to-noise ratio and know your scaling points," he said. "Calculate IOPS requirements properly and back up your data regularly." Nitz also recommended that when scaling, people should tune app/systems and run SSDs.
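The advice to tune write concerns per operation can be pictured as a small lookup table. This is a sketch only, not anything shown in the talk: the operation names and the chosen settings are hypothetical, and the only part taken from MongoDB itself is the set of write concern fields `w`, `j` and `wtimeout`.

```python
from typing import Any, Dict

# Hypothetical operation classes mapped to write concern settings.
# The keys w, j and wtimeout mirror MongoDB's write concern fields;
# the operation names and values are assumptions for illustration.
WRITE_CONCERNS: Dict[str, Dict[str, Any]] = {
    # High-volume samples: losing one is tolerable, so do not wait for an ack.
    "metrics_sample": {"w": 0},
    # Ordinary application writes: acknowledged by the primary.
    "user_profile": {"w": 1},
    # Writes that must survive failover: majority ack, journaled, bounded wait.
    "billing_event": {"w": "majority", "j": True, "wtimeout": 5000},
}

DEFAULT_WRITE_CONCERN: Dict[str, Any] = {"w": 1}


def write_concern_for(operation: str) -> Dict[str, Any]:
    """Return a copy of the write concern settings for an operation class."""
    return dict(WRITE_CONCERNS.get(operation, DEFAULT_WRITE_CONCERN))


if __name__ == "__main__":
    for name in ("metrics_sample", "billing_event", "something_else"):
        print(name, write_concern_for(name))
```

Keeping the choice in one table makes the trade-off between durability and latency explicit for each kind of write, rather than leaving it to a single global default.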
Ana Hevesi presented Nodejitsu, a hosting platform for Node.js. "Nodejitsu was founded in New York City," she said. "It is the first company to build entirely in node. It is also the maintainer of over two dozen open-source projects."
Nodejitsu works with a completely distributed team. Its hiring practices include encouraging candidates to begin by "first patching our project on GitHub." The candidates start by "helping us with support on IRC. We make people prove that they believe in our cause first, then employment second. This gives us access to 'off the grid' talent."
After the initial process, the candidates start as support engineers. "When they're not with customers, they build and they eventually move from support to development." Because of the company's remote hiring practices, Hevesi says she hasn't met over half of her team. "They're grouped in small clusters around the world. We instead communicate through IRC, Skype and GitHub."
To build a remote team, Hevesi recommends that companies invest in open source. "Also, be okay with communicating over and over. Stop being attached to face-to-face as the ultimate way to share banter and convey important stuff. Investing in the rapport when things aren't burning will let you put out fires faster when they are. Schedule IRL time and senior team members need to be good mentors."
The big-picture lesson Hevesi learned was to let people be themselves. "Let them be themselves to a level that might scare you a little," she said. "This gets you more from your team." She also said, "If you invest in a remote team, distance and diversity is an advantage. Let things look a little strange."
December's Prince Building Tech Talks Recap: http://www.officeleasecenter.com/articles/december-10th-2012-prince-building-tech-talks-meetup-bootstrapping-your-startup-in-the-cloud.html