Engineering Scale & Transport

The Scale & Transport team enables Quantcast engineering to process petabyte-scale data cost-effectively and ergonomically.

Our team owns Quantcast’s petabyte-scale data processing (20+ PB processed daily), its data transport system (100+ TB transferred daily), and its real-time event collection system (250K requests/second).

We also own the QFS open source project.

See open roles

Some crazy numbers

20+ PBs processed per day by our custom MapReduce system.

15+ PBs stored in our distributed file system (QFS).

250K requests/second handled by our real-time event collection system.

8M requests/second handled by our distributed key-value store.

100+ TBs transferred per day by our data transport system.

Production Direction

Our team is made up of strong distributed computing experts. We have a number of advanced degree holders and seasoned systems engineers. We know systems at a low level, and we apply our expertise in algorithms, data structures, and distributed computing to improve performance and reduce cost at scale.

Because of our data scale, our team runs into real theoretical computer science problems from time to time. We invented a data structure to help detect dropped and duplicate messages, for instance. We also run at such a large scale that performance improvements can save literal millions of dollars, which gives us plenty of meaty problems to work on.

Right now we’re investigating how to bring our performance and scale advantages into the modern world of Spark. We’re also investigating moving some of our batch computation to real time. Both of these are challenging because of how cost-effective our existing infrastructure is.

See What Our Engineers Work On

Nabil Zaman

Nabil developed a novel set data structure that efficiently tracks large quantities of (mostly) sequential values. Instead of storing set entries individually, it groups them into closed intervals, so each new insertion either falls into an existing interval or creates a new one.

By tagging messages passing through our large-scale data streams with sequential IDs, we’re able to use that data structure to monitor and improve the integrity of our data. We identify data loss in real time by counting the gaps in the sequence, and we eliminate duplication errors altogether.
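
The idea can be illustrated with a minimal Python sketch. This is not Nabil's implementation: the class name IntervalSet, its methods, and the choice to merge adjacent intervals on insert are all assumptions made for illustration. It keeps sorted, disjoint closed intervals, reports duplicates on insert, and counts the IDs missing between the smallest and largest ID seen.

```python
import bisect


class IntervalSet:
    """Hypothetical sketch: a set of integers stored as sorted, disjoint closed intervals."""

    def __init__(self):
        self._starts = []  # sorted interval start values
        self._ends = []    # _ends[i] is the inclusive end of the interval starting at _starts[i]

    def add(self, value):
        """Insert value. Return False if it was already present (a duplicate)."""
        # Index of the last interval whose start is <= value, or -1 if none.
        i = bisect.bisect_right(self._starts, value) - 1
        if i >= 0 and value <= self._ends[i]:
            return False  # falls into an existing interval
        extends_left = i >= 0 and self._ends[i] == value - 1
        extends_right = i + 1 < len(self._starts) and self._starts[i + 1] == value + 1
        if extends_left and extends_right:
            # value bridges two neighbours: merge them into one interval
            self._ends[i] = self._ends[i + 1]
            del self._starts[i + 1]
            del self._ends[i + 1]
        elif extends_left:
            self._ends[i] = value
        elif extends_right:
            self._starts[i + 1] = value
        else:
            # not adjacent to anything: create a new single-value interval
            self._starts.insert(i + 1, value)
            self._ends.insert(i + 1, value)
        return True

    def intervals(self):
        """Return the stored intervals as (start, end) pairs."""
        return list(zip(self._starts, self._ends))

    def missing_count(self):
        """Count values absent between the smallest and largest value seen."""
        return sum(
            self._starts[k + 1] - self._ends[k] - 1
            for k in range(len(self._starts) - 1)
        )


seen = IntervalSet()
results = [seen.add(msg_id) for msg_id in [1, 2, 3, 5, 6, 3, 9]]
# The second 3 is flagged as a duplicate; IDs 4, 7, and 8 are missing.
```

For a mostly sequential stream, the number of intervals stays small regardless of how many IDs have been seen, which is what makes this representation compact.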

Nabil on LinkedIn

Mehmet Can Kurt

Mehmet improved our cluster resource allocation algorithm so that jobs that have waited too long for resources have their priorities adjusted on the fly and are moved to the front of the queue.

This helped us address starvation issues in the Quantflow MapReduce cluster and ensure that SLAs in our stacks are not missed even when the cluster is under heavy demand.
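
The general technique, often called priority aging, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Quantflow's scheduler: the Job fields, the next_job function, and the 600-second max_wait threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int        # higher priority normally runs first
    submitted_at: float  # submission time in seconds


def next_job(queue, now, max_wait=600.0):
    """Remove and return the next job to run from a non-empty pending list.

    Jobs that have waited longer than max_wait (a hypothetical threshold)
    jump to the front, longest-waiting first. Otherwise the highest-priority
    job wins, with ties broken by earliest submission.
    """
    starved = [job for job in queue if now - job.submitted_at > max_wait]
    if starved:
        chosen = min(starved, key=lambda job: job.submitted_at)
    else:
        chosen = max(queue, key=lambda job: (job.priority, -job.submitted_at))
    queue.remove(chosen)
    return chosen


pending = [Job("nightly-report", 1, 0.0), Job("urgent-fix", 9, 900.0)]
first = next_job(pending, now=1000.0)   # nightly-report has waited 1000 s
second = next_job(pending, now=1000.0)  # urgent-fix is the only job left
```

Without the starvation check, a steady stream of high-priority submissions could keep a low-priority job waiting indefinitely.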

Mehmet on LinkedIn

Culture

Being new to the team is always going to be like drinking from a firehose; our systems are huge and custom, and there are a whole lot of them. We do have a 24×7 on-call rotation that includes each team member, and you can expect to be on call one week a month. However, we work hard to make sure that you have a healthy work-life balance. Team members rarely work more than 40-hour weeks. People can get into work at 11:30am if they want, and we encourage people to use their vacation time. People are more productive if they work less.

Quantcast, as a company, is highly engaged and social. Everyone is encouraged to be themselves at work. Individual contributors affect the company in big ways. Our annual company-wide hackathon, Quantathon, produces projects that change the way the company works every year. Our leadership team is motivated, passionate, engaged, and approachable.

Interested in joining Quantcast?