November 12th, 2012 New York City MySQL Group's AppNexus Ad Platform Event

On Monday, November 12th, 2012, OLC attended The New York City MySQL Group's AppNexus Ad Platform Event: Building For Scale with NoSQL, MySQL, Hadoop and more. Mike Nolet, CTO and co-founder of AppNexus gave the talk and answered questions from the audience.

AppNexus scaled from zero to 500k queries per second in three years. It gets a hit of 37 billion ad impressions daily, recording up to 20 terabytes of data a day. The company first started focused on cloud. "We thought we could build a cloud platform and services that provide a nexus for users and tech providers," Mike Nolet said. "Our long-term sustainable value was to build this cloud where we get companies together for low latency transactions." Nolet learned that building a cloud platform is easy. "All you need is a server, network, Perl and Xen to create a cloud."

AppNexus was incorporated in 2007 and "we got our first paying customer in 2008." Nolet explained that cloud required infrastructure. "It requires capital — startups require capital. There was no credit, no capital and a recession in 2008. That meant no cloud. This led to AppNexus," he said. The infrastructure of the company transformed to an ad nexus — a real-time selling system. It did an auction in real-time. "We would syndicate an ad request and help publishers make money," Nolet said.

"Platforms need killer apps," he said. In December 2009, AppNexus released their first real-time ad and their product skyrocketed. "We had to build a buyside to sellside, which fueled the ecosystem," Nolet said. AppNexus' seven-day QPS average was at 6.5 million sent per second and Nolet admitted that this was an outdated data. "Our financiers were totally in shock when they saw this number," he said.

"The difference between Google and AppNexus is that publishers license projects for in-house purposes. Companies like Microsoft licensed technology to create their own ad service. It all basically competes with Google," Nolet said. AppNexus goes through a lot of volume, recording 50 gigabytes out and 20 gigabytes in a day. "We laid down 2,000 servers this year," Nolet said, emphasizing the amount of space required.

AppNexus works by the user accessing the website, which hits the impression bus and both the imp bus and decision engine talk with one another and send cookies to the "Cookie Monster," which updates cookie data. The Cookie Monster also has two different servers which stores basic information: segment information and what ads the user has seen. "Cookies are the weirdest thing," Nolet said, "it deletes itself sometimes. It's like they have a mind of their own." He added, "People who claim they know the exact number of unique users are lying, because cookies sometimes duplicate themselves rendering counting absolutely irrelevant."

AppNexus is horizontally scaled. "Our number one priority is to keep running ads even if one of our hardware fails. It's not in our best interests for a website to load 404s or experience hangtime," Nolet said. He explained the function of an impression bus. "It manages accounting logic," he said. "Everything in this architecture allows everything to run smoothly."

Nolet tackled the issue of SSDs — Solid-state drives — and explained that AppNexus went through them every 6-8 weeks. "Our write-load to SSDs were so high it would kill our drives 6-8 weeks. We're looking at Fusion-io drives now.... To see the lifespan of the SSDs is just mindboggling."

Nolet discussed further about AppNexus' infrastructure and outlined Packrat, which is where all apps stream log-level data in real-time to a local cluster of Packrats: 1) Simple http post requests with snappy compression. 2) Streaming can be easily compressed if wan on disk becomes a bottleneck. 3) Only a few thousand lines of code (around 3k).

Mike Nolet reiterated that AppNexus goes through a mountain-load of data a day. "We process over 200,000 packets of data daily. Our Data Management Framework (DMF) directs data to one or more target data stores depending on the log file," he said.

The DMF synchronizes with Netezza, Vertica and Hadoop. "We transform our data into Hadoop and store by user ID every hour," he said. Thirty days of data is equivalent to 300 billion records of ad impressions (30 days because it is a business rule). The transition from Netezza to AppNexus took nine months to develop because Hadoop had virtually no good support systems.

Nolet outlined why they chose Hadoop. "Netezza is much faster out of the box and it is also plug and go, but it doesn't scale horizontally. It's expensive and we couldn't debug it. Hadoop took nine months to get into production at scale. We tune hardware and hired new workers. From a cost-modeling perspective, it was positive and met performance requirements." AppNexus has 1.2 petabytes worth of clusters today. It is estimated to triple next year. "we have enough headroom for the next two year, we could shard clusters if we have to," Nolet said.

He also gave the pros and cons of the three platforms. Netezza is incredibly fast, but it cannot perform high queries for an extended period of time. It is simple and easy to use, but it is expensive and scales unpredictably. Vertica performs fast queries on aggregated data, but it also scales unpredictably. It is great at small queries at high concurrency and has a flexible pricing model, but its writes and updates are expensive and it is a pain to get installed or configured. Hadoop is a "flexible sledge hammer" and scales predictably. It is cost-effective and has a rich open-source ecosystem although it is very labor intensive and has no support options out there. AppNexus made the decision to transfer from Netezza to Hadoop because Netezze could not handle the load.

To conclude the talk, Nolet referred back to AppNexus' inception. "We started naturally agile, but we moved out of it," he said. "But we're actually migrating back to agile."