Cloud Tech is the largest gathering of cloud technologists & engineers in the Bay Area. Our speakers include top cloud computing entrepreneurs & experts.
Come join us Saturday, October 6th, from 9am to 6pm at the Computer History Museum in Mountain View, CA, for a full 8 hours of learning directly from great minds sharing their secrets!
Special thanks to our sponsors who made this all possible. They are: CloudStack, Scalr, VMware, Rackspace, HP, DataStax, AWS, Canonical, Puppet, and General Catalyst.
Come listen to Apolak Borthakur, the head of Amazon EC2's Bay Area office, talk about what it takes to run the world's largest cloud, grow it, and staff it to power the fastest-growing organizations on the planet.
At Airbnb, we recently released Chronos for building complex data pipelines with dependencies (http://nerds.airbnb.com/introducing-chronos). Chronos allows scheduling jobs in a fault-tolerant and distributed way. It is a Scala framework built on top of Mesos, a kernel for the cluster. Mesos is in production use at Twitter and Airbnb and runs on thousands of nodes. This talk will cover the basics of Mesos and how we built Chronos on top of it.
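To give a flavor of how Chronos is driven, here is a minimal sketch of a scheduled-job definition. The field names follow Chronos's JSON job format (jobs are submitted to its REST API), but the job name, command, and values are purely illustrative:

```python
import json

# Illustrative Chronos job definition (values are assumptions, not from the talk).
job = {
    "name": "nightly-aggregation",          # hypothetical job name
    "command": "run_pipeline.sh",           # shell command Chronos will execute
    # ISO 8601 repeating interval: repeat indefinitely, starting at the given
    # time, once every 24 hours
    "schedule": "R/2013-03-01T00:00:00Z/PT24H",
    "epsilon": "PT30M",                     # still run if within 30 min of schedule
    "retries": 2,                           # retry the job on failure
}

payload = json.dumps(job)
print(payload)
```

A dependent job would omit `schedule` and instead name its parent jobs, which is how Chronos expresses pipeline dependencies.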
A step-by-step presentation of how we transitioned Box's web-application stack from a single bottlenecked MySQL database to a fully sharded MySQL architecture, all the while serving 2 billion queries per day. The focus will be on the incremental steps and best practices that enabled the successful execution of this change, as well as the mistakes made and the lessons learned along the way.
We begin with an overview of our web application architecture both before and after sharding, and discuss our reasons for choosing sharded MySQL as our scaling solution. We then walk through the modifications we made to our ORM layer, including advanced features such as support for cross-shard queries and online moving of data between shards. Finally, we present a detailed description of the technique we developed for migrating live data to shards without downtime, which also supports table by table migration for added flexibility. Throughout the talk, the focus will be on how to make large-scale changes in an incremental fashion, without adversely affecting functionality, and most importantly without downtime.
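The abstract doesn't spell out Box's routing scheme, but the core idea behind an ORM layer that supports sharding can be sketched with one common approach, hash-based routing, plus a fan-out helper for cross-shard queries (shard count and function names here are assumptions for illustration):

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count, for illustration only

def shard_for(entity_id: int) -> int:
    """Deterministically map an entity id to a shard (hash-based routing)."""
    digest = hashlib.md5(str(entity_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def fan_out(query_fn):
    """A cross-shard query runs against every shard and merges the results."""
    return [query_fn(shard) for shard in range(NUM_SHARDS)]

print(shard_for(42))
```

Moving data between shards online then amounts to copying a key's rows to the new shard, double-writing during the copy, and atomically flipping the routing entry, which is why a lookup-table variant of `shard_for` is often preferred over pure hashing.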
This talk will cover how Facebook transformed its ETL and analytics pipeline from daily batch to incremental, near-realtime processing. It will discuss the technology that continuously moves, transforms, and loads data from distributed logs and sharded MySQL databases into the Hive data warehouse. HBase is used as the underlying storage for incrementally updated tables, while the data is exposed as external tables in Hive for read processing.
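The essence of an incrementally updated table, as opposed to a daily rebuild, is applying a stream of keyed updates onto a base snapshot. A minimal sketch of that merge logic, using plain Python dicts in place of HBase and Hive (the function and field names are illustrative, not Facebook's):

```python
# Illustrative sketch: last-write-wins merge of an update stream into a base
# table, the way an HBase-backed incremental table absorbs new data.
def apply_updates(base, updates):
    """Apply (key, row) updates to a base table; row=None marks a deletion."""
    table = dict(base)
    for key, row in updates:
        if row is None:
            table.pop(key, None)   # deletion marker removes the row
        else:
            table[key] = row       # insert or overwrite by primary key
    return table

base = {1: {"clicks": 10}}
updates = [(1, {"clicks": 12}), (2, {"clicks": 3})]
print(apply_updates(base, updates))  # {1: {'clicks': 12}, 2: {'clicks': 3}}
```

Because updates are idempotent per key, the pipeline can replay a log segment after a failure without corrupting the table, which is what makes the continuous approach safe.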
MySQL replication strategies for data consistency: a Percona XtraDB Cluster case study, covering:
1. synchronous replication
2. multi-master replication support
3. parallel applying, AKA “parallel replication”
4. automatic node provisioning
5. a primary focus on data consistency
The do's, the don'ts, and the why's.

Enterprises face a large problem in understanding the access patterns for exploring, developing, deploying, and maintaining machine learning at scale. In this talk we'll go through some common problems and an architecture to support all the phases of data science. We'll also talk about what to look out for when embarking on your first data science initiative.
Data Science has emerged as a field that combines expertise in quantitative analysis and distributed computing, generally driven by the need to apply algorithmic modeling in large-scale applications. Functional programming approaches such as Cascalog (in Clojure) and Scalding (in Scala) have gained popularity for commercial use cases due to their efficient solutions at scale and their desirable properties for software engineers. In this talk we will review typical use cases in real-world applications, as well as consider some of the historical drivers that have caused changes in the industry. We will also review an example application in Cascalog: a recommender system based on City of Palo Alto Open Data.
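The appeal of Cascalog and Scalding is composing pure functions over collections of records. As a rough Python analogue (not the talk's Cascalog code), the classic word-count pipeline looks like this:

```python
from collections import Counter

# Functional-style pipeline sketch: tokenize each record, then aggregate.
# The input lines are made up for illustration.
lines = ["to be or not to be", "to code"]

def tokenize(line):
    """Pure function: split a line of text into words."""
    return line.split()

# Flatten all tokens across lines, then count occurrences of each word.
counts = Counter(word for line in lines for word in tokenize(line))
print(counts["to"])  # → 3
```

In Cascalog or Scalding the same shape of computation runs as a Hadoop job over arbitrarily large datasets, which is why the functional style scales so cleanly.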
At eBay we are using Scala (along with Scalding and Scoobi) for much of our Hadoop-based batch processing, as well as for doing ETL on the generated data. In this talk I'll go over some of the Scala (and other) technologies we have embraced, talk about why we use the approaches that we do, and cover some of the larger lessons we learned along the way. Where applicable I'll use actual eBay case studies as illustrative examples.