QCon San Francisco Recap
21/Nov 2017
The big day has finally arrived. I’m here at QCon, a conference I have been tracking for a long time almost since the beginning of my career. Now, almost 10 years after, thanks to my employer, I’m finally able to attend one in person!
In this blog post, I’m going to recap these three talk-packed days and record my learnings here.
Key Note - How to make a spaceship (Julian Guthrie, Dan Kreigh)
The keynote is given by Julian Guthrie and Dan Kreigh who together wrote a book How to make a spaceship. This is a story about Peter Diamandis and Burt Rutan’s journey to build SpaceShipOne which pioneered the private space flight industry. It’s a message about passion and enterpreneurship, thinking big and defying convention. I think it’s a fairly good start to the conference with a pretty solid inspirational message.
Vision & Strategy - Epiphanies of a Netflix Leader (Josh Evans)
The first talk I attended was about vision and strategy by Josh Evans, a former Netflix manager. I enjoyed the talk very much. It provided great information on having a vision and thinking strategically, which allowed Netflix to become such a dominant force in the market.
Notes
Think strategically:
- Have a vision
- Vision brings clarity and focus
- Be a mountain climber, do not optimize for local optima, instead, look for global optima aka “the mountain”.
Attitude:
- Begin with end in mind
- Be proactive
- Prioritization along urgency-importance axis (4 quadrants)
Business Strategy
A route to continuing power in significant markets - Hamilton Helmer
Power brings both Benefit and Barrier - Benefit for yourself and barrier for your competitors.
Seven Powers
- Scale economies (e.g., Amazon)
- Network economies (e.g., LinkedIn)
- Counter-positioning (e.g., Vanguard)
- Switching Costs (e.g., iTunes)
- Branding (e.g., Tiffany)
- Cornered Resources (e.g., Patents, Pixar)
- Process Power (e.g., Toyota)
The Anatomy of a Distributed System (Tyler McMullen)
Tyler McMullen is the CTO of Fastly. At work I have come in contact with distributed system on a daily basis but have not yet been fully exposed to the aspects of a distributed system, so this is a very informative session for me personally.
Notes
Concepts
- cluster membership
- fault detection
- rendesvous hashing
- gossip
- CRDT
Problems
A health check service (multi-instance) that run in data centres that check customer origins and make sure the load balancing does not route requests to an original that’s down.
Cannot check all origins all the time - need to assign certain origins to certain health check services => distributed systems problem.
Total ordering => Too strong consistency for this system
In this system, it needs to be able to make forward progress => eventual consistency
Ownership and Rendesvous hashing
h(S, O) = W
h = hash function S = the service nodes O = origin W = weight
output:
priorities(O1) = [S2, S3, S1, S4, ...]
Failure Detection SWIM (Scalable Weakly-consistent Infection-style Process Group Membership)
1 -> 3 up? if 3 is not up, 1 -> 2, 1 -> 4: is 3 up?
Don’t DIY, use: hashicorp memberlist
Gossip for communication
Convergence
Do not need total ordering (for strong consistency). Only need partial ordering (account for concurrent events or events with no causal relationships). Lattices - two lattices can always be merged - can be used to model causality and apply partial ordering - use a version vector.
Delta-state CRDT Map
Conflict-free Replicated Data Type. There’s a talk on Day 3 which delves a little bit deeper into this topic.
Delta lattice are CRDT that are sent to the nodes of the cluster to bring consistency to the cluster.
Paper: delta-CRDTs: Making delta-CRDTs delta-based
Why distributed systems?
Single System Image - Present a single face to the end user - System is linearizable
but the real world is more in tune with partial ordering.
Speaking up without freaking out (Matt Abrahams)
This was first talk after lunch. Normally this is the sleepiest time of the day, yet I was fortunate enough to pick the most engaging talk of the whole conference to go to. This talk was absolutely amazing! Communication is important in the success of your professional life. It’s something I have struggled and speaking confidently is not something that comes naturally to a lot of us. Matt’s talk provided a lot of practical advice on how to combat nervousness, organize your material and deliver an engaging presentation.
Notes
3C’s of communication
- Confident: “Confidence equals competence” or the appearance of competence.
- Connected: Make your content relevant to your audience
- Compelling
Manage Anxiety
Anxiety is natural. Acknowledge it and learn to manage it.
Manage the symptoms
- Take deep breath - slows down automatic nervous system
- Hold something cold in the palms of your hand
- Big muscle movement, e.g., standup, open your arms - “welcome everybody”
- Drink something warm
Manage the sources
- Mindfulness - greet the anxiety
- Share the focus - give your audience something to do (focus on something other than you)
- Plan for contingencies - be prepared so you don’t feel nervous that something will go wrong
- Be present-oriented (in the moment) - put your attention in the present; don’t worry about negative outcome. Do something physical, listen to music, count backwards from 100, say tongue-twisters.
Non-verbal presence to boost your confidence
- Don’t stand like a penguin or a duck
- Point your feet forward to eliminate swinging or rocking
- Hands down on the side - have an “open” posture
- or hands rest below your belly button
- Gesture beyond your shoulders
Voice
Vocal stamina
- build up vocal stamina
- read things out loud 5 to 10 min a day
Variety
- Put emotive word in what we say. “This is a GREAT opportunity …”
- Record yourself and listen. Need inflection.
Warm-up
- “teacup hiccup” x 3
Eye Contact
- Eye contact is expected.
- Create quadrant and look at the general direction in the audience according to the quadrants
- We tend to believe more what we see. (in the presentation, Matt illustrated this point by saying putting hand on the chin while putting his hand on his cheek. The audience followed)
Content - Understand your audience and their needs
- It’s not about what you want to say, but what your audience need to hear.
- Consider the context
- Think about your goal: what do you want your audience to: know, feel, do? (information, emotion and action)
- Structure sets expectations ** problem/opportunity - solution - benefit ** what? so what? now what? ** only after the structure then you create the slides
Polyglot Persistence
The first talk I went to was Polyglot Persistence from Roopa Tangirala from Netflix. She covered 5 use cases of data persistence at Netflix, the challenges and the approaches to the data problem.
One of the promises of microservice architecture is that you’re no longer bound to a single or homogenous data store. Each component in the microservice stack can choose their own data store according to their use case and what Roopa has demonstrated here was a prime example of such promise.
Data stores
The data stores she talked about in the talk were:
- Elasticsearch: For search analyze and visualize in near real time
- EVCache: Distributed in-memory caching solution based on memcached
- Cassandra: Distributed nosql database to handle large datasets providing high availability
- Dynamite: Distributed dynamo layer for different storage engines and protocols supporting Redis, memcached, RocksDB
- TitanDB: Scalable graph database optimized for storing and querying graph datasets
CDN URL service
requirements: - high availability - very low r/w latency (less than 1ms) - high throughtput per node
EVCache is chosen. In-memory so it provides very low latency.
Playback Manifest service
requirements: - quick incident resolution - interactive dashboards - near realtime search - adhoc queries
Elasticsearch is chosen. Powerful search & analytics and interactive dashboards.
Viewing History
requirements: - time series dataset - support high writes - cross region replication - large dataset
Cassandra is chosen. Multi-datacentre, multi-directional replication and HA and highly scalable.
Digital Asset Management
requirements: - one backend plane for all asset metadata - storage of relationships/connected data - searchable
TitanDB. Distributed graph db and support for vaious storage backends.
Distributed delayed queues
requirements: - distributed - highly concurrent - at-least-once delivery semantics - delayed queue - priorities within the shard
Dynamite. Pluggable data store supporting Redis and multi-datacentre replication. They tried to use cassandra as a queue but it’s not great at being a backend for queues.
Challenges & Approaches
- Capacity planning
- Maintenance
- Monitoring
Approaches
- Subject matter experts (2 or 3). Work with microservice team to come up with requirements.
- CDE service (a self-serving service to provision data stores)
- thresholds/SLAs
- cluster metadata
- self service
- Proactive maintenance
- don’t wait for amazon to terminate the instances
- Upgrade
- Use NdBench to build up confidence
- Open source and have pluggable backends
$200 Self-Driving cars with RasPi and Tensor Flow
This talk has less relevance to my day-to-day job but more of a curiosity of mine. However, I didn’t get too much out of it. The project itself is pretty cool and it sounds like something you can DIY because all the components are low-cost and instructions are freely available.
Istio - Weaving the service mesh
Service mesh seems to be the next battle ground after the container orchestration layer. Since Kubernetes is “winning” the orchestration war, the battle has gone up the stack.