Diego Update

So I'm here to talk about Diego. I wanted to first give a big shout-out to Renee French, who designed the Go gopher, which I have appropriated very lovingly to give us some iconography to play with today.

I want to tell you what's new with Diego. Last year I had this wonderful opportunity to tell a story. It was a story that began with a problem, followed by a proposed plan that culminated in a solution. It was a story, perhaps, of hope: a new hope. This year? Hmm, it's complicated. There's a plot twist, but I'll give it away: it turns out Diego is Lattice's father. So I guess this is really "Diego Strikes Back."

Let's talk about what we're actually going to talk about. I've got five things: first, a quick recap of what Diego is, which will oddly lead us to talk about what a container actually is; then I want to briefly talk about Diego's evolution, which will take us to Lattice; and then we'll talk about the future.

OK, so let's start with: what is Diego? At its core, Diego started off as a rewrite of the Cloud Foundry runtime, and that means the DEAs, the Health Manager, Warden. We decided to redo all of these things in Go. DEA, in Go: Diego. That's the name; that's where it came from. But that doesn't tell you what it is. It turns out Diego is a distributed system that orchestrates containerized workloads. Let's dig into each of those pieces.

The distributed system: if you were to look into a running Diego installation, you would see a pile of VMs that we call cells. These are the workhorses, the big beefy machines that have all of the containers running all the applications on them. You'd also see a handful of highly available VMs that we call the brain; these have some functionality that I'll get into in a second. And you'd see a handful of VMs called the BBS. The BBS is really a centralized data store that we use to coordinate information in the cluster; it's how the cluster gets to solve the distributed-systems problem, and we're currently relying on etcd to give us consistency in the BBS.

That's the distributed system. What does it mean that Diego is an orchestrator? Well, Diego's orchestration responsibilities really fall into two things. First, Diego is a scheduler: when you bring your workload to Diego, Diego will try to optimally distribute it across the running cells, and as more work appears it will do its best to balance that workload across the entire set of cells, across availability zones if possible. Diego is also a health monitor: if your application crashes, Diego will notice and restart it. This applies on a macro scale too: if an entire cell crashes, Diego will notice and save those applications.

But what I really want to talk about is what it means to have a containerized workload. What is it that Diego is actually running? Well, we have this interesting abstraction: we can run one-off tasks in containers, and we can run long-running processes in containers. A one-off task is easy to understand: it's a unit of work that runs at most once inside a container. A long-running process is a little more complex: we have some number N of long-running instances that we distribute across the cells for high availability, and that we monitor and restart in the case of failure. This generic, platform-independent abstraction describes what Diego can do.
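To make that abstraction concrete, here's a minimal sketch in Go of the two units of work. The field names here are illustrative, not Diego's actual API, which carries much more detail:

```go
// Package diego sketches the two units of work Diego orchestrates.
// Field names are illustrative; the real API is much richer.
package diego

// A Task is a unit of work that runs at most once inside a container.
type Task struct {
	Guid   string // unique identifier for this task
	RootFS string // root filesystem to construct the container from
	Action string // what to run; executed at most once
}

// An LRP asks Diego to keep N instances of a process running,
// distributed across cells and restarted if they crash.
type LRP struct {
	ProcessGuid string // unique identifier for this process
	Instances   int    // desired number of running instances
	RootFS      string // root filesystem to construct the container from
	Action      string // what each instance runs
}
```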
Here's what we've actually done with it: we're able to take your droplet, the product of running cf push, and run a buildpack-based application on Diego. We're also able, using the same abstraction, to run a Docker-based application on Diego. And we're even able to run a Windows-based application on the same Diego cluster. What's cool is that I previewed this last year, and all of this is working today, which is very exciting. We think this abstraction has been successful; we're seeing it prove itself out.

But I think there's a lot of confusion. What does it mean to run these sorts of things inside a container? Isn't "container" synonymous with Docker? If that's the case, what are these other two things? What is Diego's relationship to Docker anyway? What is a container? Let's talk about that. At its core, a container is about isolation.

On a shared host you have a set of shared resources, and multiple tenants on that host each want to run their various processes. These processes are vying for the shared resources, and the way they have access to them is, of course, through the kernel. Isolation is all about isolating these resources, and it comes in two flavors: resource isolation and namespace isolation. Let me dig into the first one first.

Resource isolation is easy to understand. You have a single CPU, or a set of cores, on your shared box, and you have multiple tenants vying for that CPU. In an ideal world, each process is using its fair share of the CPU. But what happens if process A runs awry and begins to soak up the CPU on the box? Well, this is bad: tenant one is taking up more resources than they should, and the other tenants are being crowded out. You need some sort of isolation in this multi-tenant context. The Linux kernel has this great feature called cgroups that lets us build barriers between the different tenants. With these barriers we can make certain guarantees: tenant one cannot exceed its threshold, and tenants two and three are safe. That's resource isolation.

Namespace isolation is similar, but different. Let me pick an example; let's think about the process ID. In Linux, each process has associated with it a PID, an integer that you can use to refer to the process. Now, tenant one's process B can look at the PIDs associated with its own tenant, which is what you want. But it can also look at the processes associated with other tenants, which is bad: that breaks isolation. Again the Linux kernel comes to the rescue, this time with the PID namespace. It allows us to set up barriers that prevent tenant one from peering into tenants two and three. But it's a bit stronger than that: this is actually a namespace, and each tenant has its own namespace in the PID world, so tenants can reuse PIDs without actually conflicting. It takes this global resource and really buckets it up nicely; you're getting close to imagining that each tenant is running its own VM on the one VM. There are other isolators that the kernel provides in addition to the PID namespace: the network namespace for isolating networking concerns, the mount namespace for isolating file-based concerns, and the user namespace to make sure that users in different tenants can't do nefarious things.

So what is a container? Well, it starts with isolation, and isolation really is just a series of walls that, if you construct them together correctly, give you isolation. I want to emphasize that this is a feature of the Linux kernel, and it's a very powerful feature. But who cares? What goes inside the walls is what you care about as a developer, and that really breaks down into two things: there are contents, the actual files that go into the container, and there are processes, the stuff you actually want to run in the container. You put these three things together: that's a container.
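To ground those two flavors, here's a minimal sketch in Go of the kernel primitives at work. It assumes a Linux host, root privileges, cgroup-v1 paths, and an illustrative "tenant1" group; real container engines do considerably more than this:

```go
// Sketch: resource isolation via cgroups, namespace isolation via
// clone flags. Assumes Linux, root privileges, and cgroup v1 paths.
package main

import (
	"os"
	"os/exec"
	"strconv"
	"syscall"
)

func main() {
	// Resource isolation: cap tenant1 at half a core so a runaway
	// process cannot crowd out the other tenants.
	cg := "/sys/fs/cgroup/cpu/tenant1"
	os.MkdirAll(cg, 0755)
	os.WriteFile(cg+"/cpu.cfs_period_us", []byte("100000"), 0644)
	os.WriteFile(cg+"/cpu.cfs_quota_us", []byte("50000"), 0644)

	// Namespace isolation: give the process its own PID, mount, and
	// network namespaces so it cannot peer at other tenants.
	cmd := exec.Command("/bin/sh")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWNET,
	}
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	if err := cmd.Start(); err != nil {
		panic(err)
	}
	// Put the new process behind the cgroup barrier.
	os.WriteFile(cg+"/cgroup.procs", []byte(strconv.Itoa(cmd.Process.Pid)), 0644)
	cmd.Wait() // inside the shell, `ps` sees itself as PID 1
}
```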
So Diego runs tasks and long-running processes in containers, and in particular the implementation of containers that we use is this thing called Garden, which we built ourselves. Now, why? Well, Garden is really powerful. Garden allows Diego to programmatically and independently say these three things: make me a container; put this in it; now go run this. And it does this through a platform-agnostic API. Garden allows Diego's abstractions to be very flexible and to support three very different, very important and interesting use cases. So let's dive into them.

Let's look at cf push. To understand cf push you have to embrace the cf push haiku, which is this: "Here is my source code / run it on the cloud for me / I do not care how." So what does this look like? Well, we take your source code, we run a task on Diego, and we produce something called a droplet. This is where all your buildpacks do their work, and we call this staging. Now, what is this droplet? You can think of it as a compiled asset: it contains your application and any application-specific dependencies. If it's a Rails application, it has all your gems bundled right in. But that's all it has, so it can't run on its own; it needs a particular execution context to run in. We have a name for that context, which is just the series of files that you need to bring alongside the droplet for it to run: it's our root filesystem, and the current one is cflinuxfs2, which is a mouthful.
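Those three verbs map pretty directly onto Garden's Go client. Here's a hedged sketch (Garden's import paths and exact signatures have shifted over the project's history, so treat the details as illustrative): make a container with the cflinuxfs2 rootfs, stream a droplet into it, and run a start command.

```go
// Sketch of Garden's three verbs. Addresses, paths, and file names
// here are illustrative, not a canonical Diego configuration.
package main

import (
	"os"

	"code.cloudfoundry.org/garden"
	"code.cloudfoundry.org/garden/client"
	"code.cloudfoundry.org/garden/client/connection"
)

func main() {
	c := client.New(connection.New("tcp", "127.0.0.1:7777"))

	// "Make me a container": the isolation bit.
	container, err := c.Create(garden.ContainerSpec{
		RootFSPath: "/var/vcap/packages/rootfses/cflinuxfs2",
	})
	if err != nil {
		panic(err)
	}

	// "Put this in it": the contents bit.
	droplet, _ := os.Open("droplet.tar")
	container.StreamIn(garden.StreamInSpec{Path: "/home/vcap", TarStream: droplet})

	// "Now go run this": the process bit.
	container.Run(garden.ProcessSpec{
		Path: "/home/vcap/start",
	}, garden.ProcessIO{Stdout: os.Stdout, Stderr: os.Stderr})
}
```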

So how does the droplet run on Diego? The way anything runs on Diego: through a long-running process, an LRP. The LRP allows us to say, hey, I want a container, and to specify the contents of the container. In this case I want a container that has this root filesystem, cflinuxfs2, in it; then the LRP can tell Diego to download the droplet onto that rootfs and spin up the start command. The metadata for the start command comes out of the staging process. Isolation, contents, process: that's cf push. And you can see it if you look at the code. If you look at the LRP definition for a droplet, it has a bit for isolation (I want a 128-megabyte container), a bit about what to put inside (give me this rootfs, download this droplet), and a bit about what to run. It really elegantly brings these three independent things together. Right, that's cf push.

Well, how does Docker fit into this? Docker is very different, but it comes down to the same basic paradigm. The contents, in Docker, are described by your Docker image: it contains the files that you want in the container. And the set of processes to run comes from the Docker image's metadata. All of this stuff comes from the Docker registry, and Docker really nailed this: they've made it really easy to push out an image, to specify what you want to run, to tweak it, and then to launch a container that runs that stuff. But it's important to understand that the isolation bit, that's the Linux kernel. You can see this in Diego's LRP: we're just doing isolation, we're asking for a Docker image for the contents (Docker images are a first-class thing that we support), and then we're saying, hey, based on the Docker image metadata, go run this. So how does Diego relate to Docker? It's real simple. You can put anything you want in here, and one of the things you can put in there is a Docker image; and once you've got that Docker image, you can run anything you want in here, and one of the things you can run in there is the start command from that Docker image's metadata. That's how Docker runs on Diego. What's cool is that this is really flexible: it would be really easy to have, say, appc running on Diego, and that's something we hope to do eventually; we're just not quite doing it yet. Right, that's Docker.

Fast-forward to Windows. What does that mean? Well, I just talked about all of this Linux kernel stuff: resource isolation with cgroups, namespace isolation with all these namespaces. It turns out you can do something similar with Windows. You can do resource isolation with the kernel job object, and you can do namespace isolation; in fact, we're running your application in an isolated IIS instance. We're collaborating with Microsoft on this, and it's allowing us to build a cf push experience that's working today. It provides a container experience for Windows 2012 that we believe will only get better with Windows 2016. So you have these two very different platforms; how does Diego communicate with them? Again, this is the beauty of Garden: through one single interface. Which means you can just define a .NET LRP that looks just like your buildpack LRP or your Docker LRP: it talks about isolation; it talks about stuff to download, which includes information about the rootfs (in this case Windows, which allows Diego to figure out where to place the LRP); and it includes metadata on what to run. These three very different contexts all run on one orchestrator. It's pretty cool.
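As a concrete illustration, here's roughly what desiring a Docker-backed LRP looked like against the receptor API's Go client of that era. The import paths, field names, guid, and image follow my reading of the since-superseded 2015 receptor and should be treated as illustrative rather than authoritative:

```go
// Sketch: desiring a Docker-backed LRP through the 2015-era receptor
// API. Guids, URLs, and the image are illustrative.
package main

import (
	"github.com/cloudfoundry-incubator/receptor"
	"github.com/cloudfoundry-incubator/runtime-schema/models"
)

func main() {
	client := receptor.NewClient("http://receptor.192.168.11.11.xip.io")

	err := client.CreateDesiredLRP(receptor.DesiredLRPCreateRequest{
		ProcessGuid: "my-docker-app",
		Domain:      "my-domain",
		Instances:   3,
		// Isolation: how big a container to carve out.
		MemoryMB: 128,
		// Contents: the Docker image supplies the files.
		RootFS: "docker:///nginx",
		// Process: what to run, per the image's metadata.
		Action: &models.RunAction{Path: "/usr/sbin/nginx"},
	})
	if err != nil {
		panic(err)
	}
}
```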
All right, so let me tell you about how that orchestrator has evolved. I want to talk about two things here: the scheduler, and the API. Let's start with the scheduler.

We're used to thinking of architecture as this thing that comes in from above and tells us what code to write, and that's true. We do a lot of test-driven development, so we're also used to thinking of tests as something that does that: you always write your tests first, and your tests influence what code you write. But that's not all there is to test-driven development: your code also feeds back into your tests, and you get this nice virtuous cycle between tests and code. This is at the heart of TDD; it's all about quick feedback loops. Now, your architecture also informs what tests you ought to write. You need integration tests to make sure all your components work together, but in a complex system you also need simulations, and you need performance tests, to make sure that everything works correctly. We love the fact that these arrows point back and forth, we love these feedback loops, and we're finding it really important to have feedback loops back into architecture. This is the most useful definition of agile architecture that I can come up with: it's all about feedback loops; it's all about the stuff that you build informing your vision for how to build it.

So, last year I made a lot of noise about the distributed auction and simulation-driven development.

Here's what it looked like: you have your cells, and with the distributed auction architecture you have a scheduler on each cell. When work comes in, the schedulers (we call them auctioneers) can talk to each other and figure out where to place the workload. Now, this was really cool, and it worked. The architecture informed the code and informed the tests, and we ran a bunch of simulations to make sure that it actually worked. And you know, the simulations were running fine at the hundred-cell scale. Then, of course, we made them more realistic and went up to two hundred cells, and it started to falter. So we added some code and made it better. Then we made it bigger; it broke again. So we added some code, and it worked again. At this point we go: OK, how about thousands of cells, and at what cost? Things were getting complex, and it was time for the architecture to change. So we stepped back and we did something very simple: we moved to a centralized, highly available scheduler. Mesos does this, Kubernetes does this, Borg does this. It's just simpler this way.

OK, let me talk about APIs. When you say cf push, you're talking to the Cloud Controller, which turns around and talks to a pool of DEAs and asks them to stage and run. Now, when we started off, our mandate was "rewrite the DEAs," and we wanted to do it in a cleaner way. One of the things we knew was that the left-hand side here was very app-specific, and we wanted something a lot more generic. So we built this bridge, called the CC bridge, that translated from the app-specific domain to this more generic domain, and then we went off and built all of Diego. Now, this was working really well, but we made a mistake. I wouldn't say a mistake; we started off thinking of all of this as Diego, and because we were thinking of all of this as Diego, we made an interesting decision, let me phrase it that way: we had the CC bridge talking directly to the database. Now, that's fine, it helped us bootstrap and get working real quick, but a database is not an API. So we stepped back and we said: well, really, this is Diego, and if this is Diego, then Diego should have an API. So we built one; we call it the receptor API. And if you have an API, well, then that's obviously what the CC bridge should talk to. But now you get an interesting picture: CC bridge, CC, who cares? It's just a generic consumer of this API. What if you had another consumer? Well, that's cool, and that's where Lattice was born.

So here's Lattice. You take this picture: you have this distributed system that can run your workload, but on its own it's kind of, eh, who cares? What surrounds my containers? How do I get to them? Well, we realized that if we added the Go router layer, we could drive HTTP traffic to your containers, and if we added the logging and metrics layer, we could pull logs and metrics out of your applications. And what if we took all of this and packaged it up and made it really easy to install: vagrant up to start a local cluster, terraform apply to start one in the cloud. And what if we gave you a little command-line tool to create and manage your applications? That was Lattice. You can run it on your local VM, or with Terraform you can deploy to AWS, DigitalOcean, Google Cloud, and, thanks to the community, OpenStack. That came in as a PR; that was awesome.
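The receptor-as-API idea is easy to see in code: anything that can speak to the receptor is a consumer. Here's a hedged sketch against the 2015-era receptor Go client (names may not match later Diego APIs) that simply asks the cluster what it's actually running, which is essentially what a consumer like Lattice's CLI, or the X-ray UI mentioned later, does:

```go
// Sketch: Lattice-style consumption of the receptor API. The client
// constructor and field names follow the 2015-era receptor and are
// illustrative.
package main

import (
	"fmt"

	"github.com/cloudfoundry-incubator/receptor"
)

func main() {
	client := receptor.NewClient("http://receptor.192.168.11.11.xip.io")

	// Ask the cluster what it is actually running right now.
	lrps, err := client.ActualLRPs()
	if err != nil {
		panic(err)
	}
	for _, lrp := range lrps {
		fmt.Printf("%s index %d: %s on cell %s\n",
			lrp.ProcessGuid, lrp.Index, lrp.State, lrp.CellID)
	}
}
```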
So I'm going to talk about two things real quick with Lattice. The first is: what is the relationship between Lattice and Cloud Foundry? Again, there's a lot of confusion here. And the second, real quick: why did we do this?

So, what is the relationship? Cloud Foundry is really the union of all of these things combined: the Cloud Controller, the UAA, Diego, Loggregator, the Go router, buildpacks, services, Bosh. Lattice comes in right here; it's just these three things, Diego, the router layer, and the logging and metrics layer. It's sort of Cloud Foundry by subtraction, as James Bayer likes to say. So what don't you get with Lattice? Well, you don't get the CC and UAA, which means it's really a single-tenant environment. You don't get buildpacks yet, which means we're relying on Docker to distribute your bits, which is fine. You don't get services, so you really have to bring your own. And you don't get Bosh: we've made Lattice real easy to deploy, and Bosh just isn't easy to deploy. There are implications to that. It means you don't have rolling upgrades out of the box; you sort of have to figure that out yourself. It's possible, we just don't make it particularly easy. At the end of the day, Lattice really gives you a raw cluster experience, and we just want to encourage people to go play with and explore these technologies.

So, why Lattice? Well, we think it's a useful, low-barrier solution that solves real-world problems, and we just wanted to get it out there so that people could play with it. We think it makes exploring Diego a lot easier. We feel it's a softer on-ramp to the CF tech stack, a way to introduce more and more people to Cloud Foundry. And actually, this has been really useful: we're finding it allows us to efficiently prototype new ideas internally.

I'll talk about this a bit more, but we have a lot of new initiatives where we're just saying, hey, let's go build it on Lattice, see it work, and then bring it into the platform. Which leads us to the future.

So, what's coming? Well, the first question everyone asks is: when? And I will just gently say, hey, Diego's scope is a lot bigger than just "rewrite the DEAs": it can do Lattice, it can do Windows, it can do Docker. OK, but when are you going to ship? Well, Diego is running in production on PWS; it's handling about 5% of the load, and more importantly, it's running all of Pivotal's internal applications. So that's great. OK, but when can I play with it? Well, it's in beta while we validate our performance at hundreds of cells and do some internal security work to make sure that all's well. OK, but when can I play with it? Well, I want you to start using it today, and you can start getting us feedback. OK, but when will it be finished? It should be out of beta within Q3, probably.

All right, then what? OK, this is the exciting part. Placement constraints and placement pools, so you can have different workloads on different cells: that's top of the backlog post-beta. cf ssh: I want to SSH into my running container at a given index? Give it to me. It's working now, and the CLI support is on the way. This will ship with Diego and give you shell access, port forwarding, scp, all this good stuff. If you're administering the cluster, don't worry, you can turn it off; but if you're a developer, rejoice. TCP routing: we're kicking this off with GE, and it's very exciting; I encourage you to go check out their talk. A private Docker registry, in collaboration with SAP: check out their talk on Tuesday. Support for persistence, so persistent disk: it's a long-term goal, and we've done some experiments that Caleb and Ted are going to report on; check out their talk on Tuesday. And container-to-container networking, some sort of overlay networking story: that's a long-term goal. We just don't need it to replace the DEAs, we don't need it for the cf push workflow, but we recognize that it's something we want to bring to the platform, and it sits in Diego's future. And finally, Condenser, which is what I alluded to earlier: lightweight buildpacks for Lattice, bringing that CF experience, cf push, finding a minimal subset that's actually useful and fun to play with, and bringing it to Lattice. We're excited to do that. All right, that's the future. I have an open house today at 1:30; come and ask me questions then, or right now. And that is all. Thanks.

[Audience question about X-ray.] So, X-ray. No, no live demos, rule number one: I don't have an environment set up. But I didn't get to talk about X-ray, so I can talk about X-ray. We have this receptor API, and it gives you full visibility into what the cluster is running. So we built this really cool UI on top of it that lets you, at a glance, see what the cluster is running and understand whether there are any problems. Oh, we have a mic, thank you. Any other questions? How much time do we have, five minutes? No? OK.

[Audience:] With the DEAs we had quite a lot of problems with rebalancing clusters when there's recovery from failures, things like that. I'd heard that Diego would be the one project that could really fix that. Is there a plan for Diego to look at those kinds of distribution algorithms, redistribution, rebalancing, things like that?

Yeah, so the question is rebalancing. We have a story for rebalancing in our backlog; we want to do it. We're not going to do it before we ship, and it's something that we actually want to do in such a way that it's just always happening.
We don't want there to be a button that you can press to just magically change the entire system. So Diego will just sort of naturally, because you're running a twelve-factor application, identify applications that it can move to improve the distribution on the cluster. That's planned, but it's not there yet.

[Audience:] You mentioned Apache Mesos. Could you compare and contrast the difference between what Diego provides and what Mesos provides?

Sure. So I get this question a lot: why didn't you build Diego on top of Mesos? And it's a good question, and in some ways we could have, but there were just a couple of key things that we needed that we didn't think we could get out of Mesos. Windows support was one of them: we can actually do Windows, that's actually working today, and that's just not really a thing that Mesos, certainly at the time, could support; I don't think it does yet, either.

I'd say the other thing was that Mesos was really giving us this sort of scheduler piece, and there's just a lot more to Diego than that, so we would have had to build a lot of stuff in addition to the scheduler anyway. The nice thing with Mesos is that your scheduler can then live alongside other schedulers, so I could imagine a sort of plugin for Mesos that allows Diego's scheduler to just piggyback on that. The only thing stopping us from doing that is priority and time, frankly; these things can all sort of overlay and intermix pretty easily.

[Audience question about auto-scaling the cell pool.] Sure, so that's interesting. The question is about auto-scaling; you said DEAs, I'll say cells, right, the worker pool that runs the containers. That's interesting: that would be the first time that we have an arrow pointing from the runtime into Bosh, or whatever is orchestrating your cluster. It's definitely something that is very doable and that we would consider doing, but again, I just go back to priority and time. Is it in the future? I imagine so. I imagine a full-blown solution where an operator just says, hey, you can have at most a hundred cells; grow as you need to, but don't use my resources until you need to. I think that's probably going to come, but there are no concrete plans at this time.

[Audience:] You might have answered this question before on the mailing list, but I figured I'd ask it. It's very cool that you have, especially with Lattice, a simpler version for developers. But as we all know, anything that starts very simple tends to get complicated. So in other words, is there a guarantee that Lattice doesn't become the new CF? How does Lattice not become the new CF?

Discipline? I'm not sure; it's a good question. Let's see where Lattice goes. I think it's still early days for Lattice. You'll always have CF.

[Audience:] How does the Windows isolation work? It seems like a lot of the isolation is provided by IIS. Does that mean people can't write worker-type apps yet? Is it more web apps?

I'm going to ask Mark to come and answer that. We're a microservice architecture; Mark is the Windows microservice.

[Mark:] Yeah, so isolation on Windows is very different. In Windows, when you have a web workload, it's assumed that that web workload is integrated into the operating system itself. It's not like a Linux container, where we can just start a process and a process is just a process; in Windows you have many flavors of processes. So there are kernel primitives that allow you to go and isolate different types of workloads on Windows, and what we expose to Diego is going to be slightly different from what you'd see on Linux. Today, on Windows 2012, you're going to have the HWC and a lot of other mechanisms to isolate Windows web workloads, and then we're working on background tasks next.

All right, looks like we're out of time. Thanks, all. Like I said, open house today at 1:30.