Monday, May 14, 2012

Prince Building Tech Talk

The building located on Prince Street in Soho, New York is home to some amazing tech companies, including 10gen, ZocDoc, Thrillist, and foursquare. At their first meetup, the four companies each gave an overview of their technology and a tour of their offices.

  • Eric Milke, core server tech lead engineer of 10gen, represented the company at the talk. He graduated from Cornell University and specializes in C++.
  • 10gen, developer of MongoDB, was founded in 2007 by Eliot Horowitz and Dwight Merriman. The company now has over 100 employees and four offices.
  • MongoDB is a scalable, high-performance, open-source NoSQL database. It provides document-oriented storage, dynamic schemas, full index support, replication, high availability, auto-sharding, and fast in-place updates.
  • MongoDB's storage of JSON documents supports the company's three main pillars: availability, scalability, and simplicity.
  • For availability, MongoDB provides easy replication and automated fail-over. Servers are constantly transmitting data to each other. For scalability, reads can be scaled out and sharding is built in. The system automatically rebalances shards if they grow disproportionately. And for simplicity, the database has a simple configuration with few startup parameters, a flexible document model that doesn't force unnecessary normalization, and natural-looking language bindings.
  • The server is written in C++ and makes extensive use of memory-mapped files. 10gen supports drivers for C, C#, C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, and Scala, and there are many unofficial third-party drivers for other languages.
  • Their future plans include improving concurrency (yielding and database-level locking), a new aggregation framework, TTL collections, hashed shard keys, and an improved free-list implementation to avoid fragmentation.
  • The perfect job combines the stuff you love to do, the stuff you're good at, and the stuff you get paid to do.
  • Harry Heymann presented foursquare's stack, team structure, vision, and a couple examples of tough technical problems they're trying to solve.
  • foursquare is a "better Yelp than Yelp." It is individualized to the user. If the user likes expensive restaurants, it recommends higher-end places to eat.
  • foursquare has logged 2 billion checkins, 20+ million users, and 30+ million places. The company is only three years old, but it has 110 employees, including 50 server engineers, 12 client developers, 5 product managers, and 6 designers. The company frequently holds hack days.
  • foursquare uses MongoDB for its database.
  • Local ads are the biggest ad market.
  • Their company culture is to hire the best and give them authority to build what they want with little overhead. Developers form teams of two or three people and they conceive and launch features on their own schedule.
  • In the last year, the company has streamlined the signup process, revamped venue searches, added user and merchant newsletters, added networked list and web explore, deeply integrated Facebook, and implemented user tools, menus, and fast checkin flows.
  • Future plans include auto-identifying local experts and surfacing their knowledge to other users (e.g., getting sushi experts to give tips about the best local sushi places), letting users follow merchant news, scaling upwards to 20 billion checkins, recommending based on GPS, aggregating a huge number of tips, and visualizing a user's history as a beautiful infographic-like diary. And there are always algorithmic improvements to the software.
  • Tools that they use include the Scala programming language, Hadoop, Hive for data analysis, EC2 hardware, and MongoDB.
  • They stopped using Google Maps because it was too expensive and went with OpenStreetMap, a user-driven map system. By analogy, Wikipedia overtook Encyclopedia Britannica in five years, and they see the potential for the crowd-sourced, open-data map to do the same.
  • They plan to add functionality that promotes super users and gives them editing permissions on the database.
  • They have one of the nicest offices. The main working area promotes a flat structure: there are no cubicles or offices, and all the desks sit together without any divisions between coworkers. Multiple cameras, which together oversee the entire space, link to the other office in California. The cameras are on 24/7, which makes the other office feel like an extension of the NY office rather than a completely remote location. During inter-office stand-ups, all the team members actually stand up next to the big monitors that are connected to the other office. Whenever they want to speak to someone at the other office, rather than sending an instant message, they go to the stand-up station, unmute the microphone, and ask for that person.
  • There is an abundance of conference rooms, each named and themed after a foursquare badge. The Herbivore room is filled with plants and rustic furniture, the Socialite room has modern furniture and a nightclub feel, and the Far Far Away room has wallpaper showing a skycam view of New York City.
  • In the middle of the office is the Atrium, with a ceiling window so employees can get some sun and a small theater space for in-house classes and experimentation. The kitchen is filled with picnic tables and snacks and has a small stage in the middle for talks. There are also multiple phone booths spread around the office for private phone calls.
  • Mark O'Neill, CTO of Thrillist, ran through the history of the company and the technologies they employ.
  • Thrillist is a men's lifestyle website that makes data-driven recommendations. It has two sister sites: JackThreads, a members-only online-shopping community for men, and Thrillist Rewards, which runs curated events for its target demographic of young males.
  • Thrillist's stack involves LAMP, MongoDB, and Solr, along with some Ruby and Python. The JackThreads stack includes Drupal and MySQL, and Thrillist Rewards uses Cake.
  • The three pillars of the stack are content, users, and commerce. After Thrillist merged with JackThreads and launched Rewards, they began moving to a service-oriented architecture.
  • In their new architecture, they have four main services named JITR, Zuul, Artemis, and Content.
  • Future plans include API acceleration, personalization, recommendations, and analysis of user behaviors.
  • Nick Ganju, the CTO of ZocDoc, talked about ZocDoc's technology and building the engineering team.
  • ZocDoc is a website to find local doctors and book appointments online. The average wait time for a doctor is 21 days, but ZocDoc appointments are made within 48 hours.
  • The company has only 35 engineers but plans to double that number by year end. They run hackathons in the Hamptons and have a super selective hiring process.
  • ZocDoc is focused on scaling, both in the sense of user traffic and in the sense of developers and code base. Currently, they have millions of lines of code written.
  • Tools that they use include Tortoise, Kiln, and Mercurial. Nick stresses that developers need to move to distributed version control as soon as they start scaling. They also use TeamCity and Jenkins.
  • All of their other tools are Microsoft-based, including C#, Visual Studio, IIS 7, and SQL Server 2008. They use C# because it is important to use a statically typed language when you scale up. C# provides many modern programming features such as lambdas, tuples, async, parallel, and dynamic binding.
  • They build each feature on its own branch and have automated rule-based code reviews that detect production commits and run static code analysis. The code reviews outlaw certain classes like SqlCommand. A Selenium bot runs unit tests every 15 minutes and can roll back the code if a committed feature ends up breaking.
  • They constantly run A/B tests for every possible feature and also implement feature flags, a practice inspired by Etsy. With feature flags, they put lots of Booleans in the code to easily turn features on and off.
  • Developers should not over-engineer at a startup. The priority is to develop the product and release it, then go back and refactor when you have the resources. Over-engineering your caching system, for example, will delay your product development.
  • ZocDoc traces everything that happens on a page, but the logs can only be read from trusted IP addresses. This helps them track down bugs that happen randomly every once in a while, also known as Heisenbugs.
  • They developed ZocMon, a monitoring tool like Google Analytics that will soon be open-sourced.
  • They deploy daily. Code freezes at 9pm, after which the testers in India spend the rest of the time testing.
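
The feature flags ZocDoc described can be illustrated with a small sketch. This is not ZocDoc's code (their stack is C#); it is a minimal Python example, and the names FeatureFlags, render_booking_page, and "new_booking_flow" are all hypothetical. It only shows the general idea of guarding a code path with a Boolean that can be flipped without a redeploy.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store.

    A real system would more likely load flags from a config file or a
    database; that part is assumed and left out here.
    """
    flags: Dict[str, bool] = field(default_factory=dict)

    def enable(self, name: str) -> None:
        self.flags[name] = True

    def disable(self, name: str) -> None:
        self.flags[name] = False

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so unfinished code stays hidden.
        return self.flags.get(name, False)


def render_booking_page(flags: FeatureFlags) -> str:
    """Hypothetical page handler that picks a code path based on a flag."""
    if flags.is_enabled("new_booking_flow"):
        return "new booking flow"
    return "old booking flow"


flags = FeatureFlags()
before = render_booking_page(flags)   # flag is off by default
flags.enable("new_booking_flow")
after = render_booking_page(flags)    # same code, new path turned on
```

Defaulting unknown flags to off is one possible design choice: it lets a half-built feature ship in a daily deploy without being visible until someone turns it on.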
