I've been using Kubernetes on Azure and GCE recently and it's absolutely wonderful.
I was able to set up an entire ecosystem from scratch in a week that scales well and can be managed in one location.
When I first looked at Kubernetes, the complicated part was setting it up on a cluster. If you use Kubernetes on GCE or Azure, you don't have to do that step; everything else is ready to go for you!
- Automatic scaling of your application
- Service discovery
- Secrets and config management
- Logging in one central dashboard
- Able to deploy various complicated, distributed pieces of software very easily using Helm (Jenkins, Kafka, Grafana + Prometheus)
- Able to add new nodes to the cluster easily
- Health checks and automatic restarts
- Able to deploy any container to your cluster in a really simple way (if you look at Deployments, it's really simple; see the sketch after this list.)
- Switch between cloud providers and still maintain the same workflow.
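To give a feel for that, a minimal Deployment manifest (the image name is a placeholder) is about this much yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: registry.example.com/my-app:1.0  # placeholder image
            ports:
            - containerPort: 8080

One `kubectl apply -f deployment.yaml` and the cluster keeps three replicas running, restarting or rescheduling them as needed.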
I won't ever touch Ansible again, I really prefer the Kubernetes way of handling operations (it's like a live organism instead of something you apply changes to.)
Also, the entire argument that you probably don't need Kubernetes because your organization doesn't have tens or hundreds of nodes just doesn't hold up after using it.
Having a Kubernetes cluster with 3 nodes is 100% worth it, even for rather simple applications in my opinion. The benefits are just way too good.
I'm glad you mentioned Azure and GCE -- I started off with minikube locally and things didn't really become magical until I started using a kops-deployed AWS EC2 cluster and saw the magic of elastic scaling, ease of deployment, and migrate-ability.
PS - Could you share which books/websites/resources you used to get up to speed to the point where you're at now?
Would you mind sharing links to any of the resources you used to get yourself going with kops on AWS? I'm in the midst of setting up my first cluster as we speak.
Not sure why you would throw Ansible into your pod.
Someone has to build your Docker images for Jenkins, Kafka and Grafana + Prometheus.
I also like Kubernetes, but I don't think Kubernetes is necessary for small controlled environments.
Ansible just works with the help of Galaxy. I set up Jenkins, Kafka, and Grafana + Prometheus faster with Ansible than with Kubernetes, and with more/easier control, specifically when I take care of stuff inside those services that isn't yet accounted for in the Docker container.
Also, small companies just don't need the scaling thing.
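There is a level of complexity to Kubernetes. And if you needed to leave GKE for another provider or a packaged distribution, how would that look?
Thanks for any answers!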
EDIT: I should preface my little rant by saying that this post is one of the best I've seen at explaining the basic concepts of Kubernetes. But obviously I'm not an expert :)
> The Kubernetes API should now be available at http://localhost:8001, and the dashboard at this rather complicated URL. It used to be reachable at http://localhost:8001/ui, but this has been changed due to what I gather are security reasons.
I was playing around with GCE Hosted Kubernetes about a year ago, and things were pretty clear as far as I recall. I've read lots of positive things, and figured it's a good way to start.
Then I tried again recently, and I couldn't even get to the dashboard. Eventually after several cryptic StackOverflow copy&pastes I managed to load it (don't even remember how), only for the session to expire after 10 minutes or so... It was utterly frustrating. I didn't actually get to the more interesting part I was planning to play with as a result...
People say that there's a learning curve, and I get it. And I'm not even trying to install Kubernetes on my own, just to use a hosted service. I'm also pretty switched on when it comes to security and trying new things (or I'd like to think I am), but there are some things that feel like too much of an obstacle for me, unfortunately.
GKE actively advises against using the Kubernetes Dashboard and recommends their own dashboards as a replacement.
On their website [1] they list the following:
> Caution: As of September 2017, Kubernetes Dashboard is deprecated in Kubernetes Engine. You can use the dashboards described on this page to monitor your clusters' performance, workloads, and resources.
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/das...
That's good to know. I guess I was just heading in the wrong direction from the start. I think I did try the built-in dashboard on the GCE web console, but somehow didn't see any of the stuff I was expecting (but maybe I've just assumed the "real" dashboard is the one I was trying to load).
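https://cloud.google.com/kubernetes-engine/kubernetes-comic/
PS. McCloud? I sense nominative determinism at play.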
I gave up on my own Kubernetes writeup a while back. I just had a lot of trouble with basic networking configuration, logging, etc.
I've been at one shop with a large-scale DC/OS installation. You can run a k8s scheduler on DC/OS, but by default it uses Marathon. DC/OS has its own problems for sure, and both tools require a full-time team of at least 3 people (we had 8~10), and there are a lot of things that will probably need to be customized for your shop (which labels to use, scripts to set up your ingress/egress points in AWS, HAProxy configuration or marathon-lb configuration, which is just an HAProxy container/wrapper), but I think I still prefer Marathon.
I briefly played with Nomad and wish I had spent more time with it. I know people from at least one startup around where I live using it in production. It seems to be a bit more minimal and potentially more sane.
The thing I hate about all of these is there is no 1-to-n scaling. For a simple project, I can't just set up one node with a minimal scheduler. DC/OS is going to cost you ~$120 a month for one non-redundant node:
https://penguindreams.org/blog/installing-mesosphere-dcos-on...
I hear people talk about Minikube, but that's not something you can expand from one node to 100, right? You still have to build out a real k8s cluster at some point. All of these tools are just frontends around a scheduling and container engine (typically Docker and VMs) that track which containers are running where and track networking between nodes (and you often still have to choose and configure that networking layer: Weave Net, Flannel, etc).
I know someone will probably mention Rancher, and I should probably look at it again, but last time I looked I felt it was all point-n-click GUI and not enough command-line flags (or at least not enough documented CLI) to really be used in an infrastructure-as-code fashion.
I feel like there's still a big missing piece of the Docker ecosystem: a really simple scheduler that can easily be stood up on new nodes, attach them to an existing cluster, and has a simple way of handling public IPs for web apps/haproxy containers. I know you can do this with K8s, DC/OS, etc., but there is a lot of prep work that has to be done first.
> All of these tools are just frontends around a scheduling and container engine ...
Well, that's a gross simplification of what Kubernetes is. (I don't know about Marathon.)
Kubernetes is a "choreographer" of cluster operations. At its core it's a consistent object store that contains a description of the state you want your cluster to be in. Various controllers monitor this store and try to "reconcile" the real world with the desired state. Operations include things like creating persistent volumes, setting up networking rules, and, of course, running applications. To say that it's a frontend for a container engine is a bit misleading, since Kubernetes can control so much more.
It's a nicely layered system — a "pod" describes the desired state of a single instance of an app, a "replica set" describes the desired state of a set of pods, a "deployment" describes the desired incremental rollout of a replica set, and so on. It's also a design that scales down to a single node (hence the popularity of Minikube as well as Docker for Mac, which includes Kubernetes), as well as up.
It's also a design that means that with a few exceptions, your configuration can target any Kubernetes cluster, not a specific cloud vendor. Without a single modification, I can deploy my app to the local Kubernetes on my laptop, or to our production cluster on Google Cloud. While migrating to Docker/Kubernetes took nearly a year, migrating away from GCP would take us probably less than a week (most of it involving pointing DNS to new load balancers, and moving persistent volumes over).
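Concretely, that's just a matter of switching kubectl contexts (the context names here are made-up examples):

    # same manifest, two clusters
    kubectl --context=docker-for-desktop apply -f app.yaml
    kubectl --context=gke_my-project_us-east1-b_prod apply -f app.yaml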
Beyond Google Kubernetes Engine and various other clouds (Azure is apparently very good), there's a bunch of tools now that do the heavy lifting of creating a cluster somewhere. Kubeadm and Kops are both popular.
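Have you tried GKE? https://cloud.google.com/kubernetes-engine/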
It abstracts away all the work of setting up a close to production HA cluster, so you can jump quickly into developing & deploying your app. You can start with 1 node and ask GKE to scale to N when you want it.
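kubeadm can do that:
https://github.com/kelseyhightower/kubeadm-single-node-clust...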
It basically comes down to bootstrapping it like normal and then removing the 'node-role.kubernetes.io/master' taint so that things can run on the master node.
The one area in kubeadm that is still being worked on is bootstrapping an HA cluster, but if you don't mind having a single master node, you can easily bootstrap a cluster and then add nodes to it later.
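Concretely, the untainting step is a single kubectl command (the trailing dash means "remove this taint"):

    # allow regular workloads to be scheduled on the master node
    kubectl taint nodes --all node-role.kubernetes.io/master-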
I'm re-evaluating k8s again - tried it one or two years ago and hit some roadblocks for my use-case.
Kelsey's tutorial is a bit outdated (Oct 2, 2017, with k8s v1.8; v1.11 just got released). Here is a link to the official kubeadm guide for "Creating a single master cluster with kubeadm":
https://kubernetes.io/docs/setup/independent/create-cluster-...
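It basically is just running

    kubeadm init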
on a server / VM (after installing docker and kubeadm, of course). Add a pod network add-on (Calico seems to work well, 2 commands to install), remove the mentioned taint and optionally join more worker nodes (also a single kubeadm command). Every step is in the guide, just copy & paste. ;)
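For flavor, the remaining steps look roughly like this (the Calico manifest URLs change between releases, so treat them as placeholders and copy from the guide):

    # pod network add-on (Calico; manifest URLs are release-specific)
    kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
    kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
    # allow pods on the master (single-node use)
    kubectl taint nodes --all node-role.kubernetes.io/master-
    # on each additional worker, with the token printed by kubeadm init
    kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>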
Note: This is not a production-ready cluster (it has a single master), and you should have some basic understanding of k8s, which the OP provides. I also highly recommend digging around kubernetes.io/docs - good material there.
I started with kubeadm some days before the release of k8s v1.11, which made some stuff I wrote obsolete, oh well... :) I really like the new kubeadm phase stuff, though.
There is also an official guide for Creating Highly Available Clusters with kubeadm (it's updated for v1.11) which I just went through:
https://kubernetes.io/docs/setup/independent/high-availabili...
I opted for the "stacked masters" approach (run HA etcd right on the k8s master nodes), wrote some ansible tasks to automate the boring stuff like copying configs/certs etc., and am currently (re-)exploring add-ons and advanced functionality (helm, network policies, ingress controller, ceph via helm, ...).
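One of those boring tasks, as a minimal sketch (the paths are the stock kubeadm ones; the real tasks handle more files):

    # sketch: pull the cluster CA off the first master for reuse on the others
    - name: fetch kubernetes CA from the first master
      fetch:
        src: "{{ item }}"
        dest: pki/
        flat: yes
      loop:
        - /etc/kubernetes/pki/ca.crt
        - /etc/kubernetes/pki/ca.key

Let's see how far I get this time!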
I have yet to see a good tutorial that shows an automated build of a kubernetes cluster. Yeah, you can use GKE, but that gets to be prohibitively expensive.
Mist.io provides a Cloudify blueprint that can be used to deploy and scale Kubernetes clusters on any supported cloud. It's using kubeadm under the hood.
https://docs.mist.io/article/119-kubernetes-getting-started-...
Disclosure: I'm one of the founders
I really want to like Kubernetes, but going beyond the basics seems to require a way higher understanding of systems engineering than I currently have. Yes I know you can create container networks and stateful pods with attached storage, but how is always seemingly beyond me. Network and storage in distributed computing is hard and Kubernetes seems to be a slightly more magical bullet than Docker Swarm alone.
Completely agree... going beyond the basics is really hard. But as I understand it, that's because of inadequate knowledge of advanced network/storage virtualization concepts. Any help on how to get started with those? As a side note, I have decent knowledge of basic networking/storage.
Look up persistent volume claims and dynamic provisioning. If your cluster is properly configured, then you are one yaml file away from having that. By "configured properly" I mean having the correct cloud provider set; otherwise it cannot talk to the APIs.
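That one yaml file is roughly this (a minimal sketch; the storage class depends on your provider):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      # storageClassName: standard  # optional; omit to use the cluster default

Reference it from a pod spec under volumes (persistentVolumeClaim with claimName: my-data) and the disk is provisioned and attached on demand.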
To be fair, stateful containers in general are a relatively new thing in K8s, and support has been improving.
Also, K8s is trying to do and abstract away a lot, it is more like a distributed operating system by itself. So it is more complicated than swarm.
We use Kubernetes to spin up the application that I work on (in private cloud, and at some point in hybrid and public cloud deployments). It's an end-user-installed tool. About 1/4 of new installations fail because of some problem or another: either the NVIDIA GPU plugins weren't loaded correctly, kube-dns won't start because docker0 isn't in a "trusted" zone on Red Hat (not being in trusted seems to cause iptables to subtly screw up container-to-container communication between the various private networks), or Helm just decides that it can't start.
Are we doing it wrong?
We’re using hyperkube and k8s 1.8 which came out around q4 of last year.
Almost all of these I can trace back to user error (i.e. we told folks to do X, they didn't, and stuff broke). We're now having to write a preflight checklist of sorts that the app runs through to make sure a bunch of stuff is "ok." That in itself becomes brittle in my experience, so I'm reluctant to do that.
It requires considerable operational experience and effort to run well on bare metal. Have you considered experimenting with a managed Kubernetes offering? Out of EKS, AKS, and GKE, I cannot recommend GKE highly enough.
Even if you can’t use it for production, it’s highly worth setting up a prototype environment on GKE to see what it gives you. I believe they now have GPU support, including support for preemptible GPUs (much cheaper).
Also GKE does a good job of staying up to date with releases. They are on 1.10, with support expected soon for 1.11. They _fully_ manage the upgrade of both the Kubernetes and etcd masters, and the worker nodes.
As mentioned in the linked blog post, Kubernetes is a fast moving project, and to use it you should plan and allocate significant resources in your team to keeping up with the new releases. There are a large number of fixes and improvements since 1.8 and I would look very seriously at both upgrading, and changing your processes to allow you to stay closer to the current release version.
The Kubernetes project does not, and has no current plans to, have a long term support release.
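We then ask to rerun the script. If it fails, we fix the script.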
What is now happening around Kubernetes is that various companies and projects are coming up with "distributions" of their own. These package the underlying OS, the Kubernetes binaries, and a number of built-in system pods that take care of various things such as logging, scheduling and monitoring. Red Hat's OpenShift is one of them, as well as Weaveworks'. Maybe try having a look at one such solution?
There is also the Canonical Distribution of Kubernetes (CDK).
If you want to try a stripped down version on your local machine, kubernetes-core can be installed into LXD containers via conjure-up with a click of the fingers - nice for playing around.
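If memory serves, that's just:

    sudo snap install conjure-up --classic
    conjure-up kubernetes-core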
Not sure if I got the scenario. Are you installing K8s on bare metal using a previously installed OS that you do not control?
If so, that will be harsh. I am not sure how to help you there. There are many things that could go wrong as you say.
If you do have control over the OS (you are providing an ISO or virtualization image) then it should mostly "just work". My company is doing a similar thing, only we ship boxes with everything pre-installed. There is another scenario for hybrid cloud, but even then they download a VMWare image.
Also, if you do have control over the OS: can you use CoreOS instead? It is very well suited for running K8s and has fewer things that can go wrong. Red Hat bought them anyway. With Ignition (or even the old-fashioned cloud-init with a config drive), it is a no-touch deployment (you do have to generate and inject the certificates beforehand).
One thing that sounds weird is that you are "telling folks to do X". Can you avoid telling them anything and have it automated?
Actually, you should probably look into kubeadm or bootkube.
I tried to run some kind of from-scratch cluster myself; it's just way easier to maintain/update clusters that are on kubeadm or bootkube.
Also, the stuff you describe does not look like user error.
BTW, you should not expose the deployment yaml/json to "users"/"developers".
You should have a CI that just runs `kubectl set image deployment/name pod-name=IMAGE` and keep all deployment descriptors, etc. in a separate source repository.
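i.e. the whole deploy step in CI can be a few lines (all names here are placeholders):

    # hypothetical CI deploy step
    IMAGE="registry.example.com/myapp:${CI_COMMIT_SHA}"
    docker build -t "$IMAGE" . && docker push "$IMAGE"
    kubectl set image deployment/myapp myapp="$IMAGE"
    kubectl rollout status deployment/myapp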
Isn't this exactly what you should expect from kubernetes? If you can't afford a team of people working on it you are supposed to give up and accept GKE lock-in.
I don't think GKE is lock-in when there are offerings from other companies and the migration is quite simple. Also, if at some point you feel like it's time to host your own, then you can do it. And I don't find GKE that expensive, especially paired with their cheap VMs, or preemptible VMs if you're fine with those.
We are working on a project with standard LXC containers [1] which tries to make orchestration and some of this stuff, especially networking, simpler.
We support provisioning servers, building overlay networks with VXLAN, BGP & WireGuard, distributed storage, and rolling out things like service discovery, load balancers and HA.
It may be worth exploring for those struggling with some of the complexity around container deployments. At a minimum it will help you understand more about containers, networking and orchestration.
[1] https://www.flockport.com
I wish I could find a tutorial for bare metal:
- how to set up cert creation for FQDNs with Cloudflare
- storage
- ingress to support multiple IPs glued to different nodes (so if a service gets IP x, it gets routed through node z that has this external IP)
I spent 6 months trying to do that and no luck.
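For what it's worth, the usual building block for that last item is a Service with externalIPs (or MetalLB on bare metal); a minimal sketch, with a placeholder address:

    # traffic for 203.0.113.10 is handled by whichever node owns that IP
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      selector:
        app: web
      ports:
      - port: 80
        targetPort: 8080
      externalIPs:
      - 203.0.113.10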