
Real container ssh #1513

Closed · bgrant0607 opened this issue Sep 30, 2014 · 80 comments

Labels: area/introspection, area/security, area/usability, priority/awaiting-more-evidence (Lowest priority. Possibly useful, but not yet enough support to actually get it done.), sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery.)

Comments

@bgrant0607
Member

Forking ssh off of #386.

To debug a containerized application today, a user can ssh to the host and then use nsenter/nsinit or docker exec to enter the container, or they can run their own sshd inside their containers or pods, which we'd like them not to need to do. For non-interactive commands they could use the kubelet's RunInContainer API, but there will be cases where the user wants an interactive shell or debugging session. For that, we should facilitate ssh directly into a container.

I don't think we'd implement ssh support exactly the same way, but geard implements a neat trick in order to support ssh directly to a container. I looked into how it did this a few months ago.

geard will generate ssh keys for containers upon request, and put them into /var/lib/containers/keys.

In addition to the keys themselves, geard injects its namespace-entering command:
command="/usr/bin/switchns"

switchns finds the container with the same name as the "user" and starts bash in the container using the equivalent of docker exec.

/etc/ssh/sshd_config needs to be changed to add a geard-specific key lookup command:

AuthorizedKeysCommand /usr/sbin/gear-auth-keys-command

This enables you to do:

ssh ctr-mycontainer@myvm

This mechanism was intended for lookup of users and their keys in remote services, such as LDAP.
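
Putting those pieces together, the sshd wiring and the per-container entry that gear-auth-keys-command returns would look roughly like the following sketch (the key blob and the AuthorizedKeysCommandUser value are illustrative guesses, not taken from geard):

# /etc/ssh/sshd_config (sshd requires AuthorizedKeysCommandUser once a command is set)
AuthorizedKeysCommand /usr/sbin/gear-auth-keys-command
AuthorizedKeysCommandUser nobody

# entry returned for the ctr-mycontainer "user"; the forced command drops the session into the container
command="/usr/bin/switchns" ssh-rsa AAAAB3Nza...placeholder... ctr-mycontainer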

Relevant assumptions for our solution:

  • It should look like standard ssh as much as possible, as per the proposal in #386
  • I assume we'll be supporting user namespaces at some point
  • IP per pod
  • Potentially multiple containers within the pod
  • The problem with ssh <user-within-namespace>@<IP-of-pod> is that the pod could have multiple containers

I haven't done enough investigation to make a proposal as yet, but I wanted to record this feature request.

/cc @smarterclayton @erictune

@smarterclayton
Contributor

We've got some iterations being proposed as well that extend the idea of ssh user name mapping to an arbitrary scope.

First, we want to make the security integration cleaner: sshd limits us to real users on the system, and SELinux further restricts how content can be organized. A short-term solution is to make authn info (keys, passwords, etc.) easier to retrieve on demand from the masters (on demand because ssh is rare and we can tolerate a bit of delay). A longer-term solution is to either patch sshd like GitHub did, lean on sssd more heavily, or drop in a separate sshd implementation like the go.ssh library. All have drawbacks, of course.

A second (parallel) step would be to move the authenticating ssh proxy part off the host, and have it handle all the details of mapping the incoming user and the ssh path (for git and others) into a valid destination, then establish an ssh forward or other action to the remote server under a different set of credentials. The pattern is the same ("fake" user representing an entity, ssh key command with generated keys, and a step command), but we'd be able to concentrate authn and authz decisions. The backend servers' sshd would then deal with one or two users representing roles and tasks (gitreader, logpuller) instead of having the full set of users. The private keys for those roles would live on the authenticating ssh proxy. The end goal would be for the ssh user to the proxy to be something like a project or namespace (auth is checked against users/keys related to that project), and then to use the path to further distinguish the target of the operation (/pods/XYZ/containerName). That also lets us control fan-out of keys more tightly.

Everything in the second step is useful even without it being in a proxy (standardizing a set of tools around ssh to allow this); it's just easier to manage in a proxy.
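
As a concrete illustration of that end goal, a login against such a proxy might look something like this (the proxy host, namespace user, and path syntax are all hypothetical, not an implemented interface):

# the proxy authenticates mynamespace, authorizes access to pod XYZ,
# then forwards the session to the node under a role credential
ssh mynamespace@ssh-proxy.example.com /pods/XYZ/containerName bash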


@dchen1107
Member

@bgrant0607 why not ssh ctr-mycontainer@ or ssh ctr-mycontainer@pod-name, instead of ssh ctr-mycontainer@myvm?

@erictune
Member

We haven't talked about what unix UIDs we will have for pods once the system is multiuser. I suggest dynamic allocation and garbage collection by the kubelet. The AuthorizedKeysCommand will need to be able to discover the container -> allocated rootns_uid mapping.

@erictune
Member

It gets complicated, but users would really like to be able to SSH into the environment of a container which has previously exited in failure. They want to be able to poke around in the same filesystem view (including Volumes) that the failed container had. There are a couple of ways you could accomplish this.

@erictune
Member

So, we have 4 or so values that we might want to put into two slots (user@host):

  • username
  • container name or UID
  • pod name or UID
  • pod IP (possibly the same as pod name if we end up using DNS for pod names, which I know is not decided)
  • VM hostname or IP

I agree with Dawn that we should not waste one of those slots on the VM name. VMs are not the thing that kubernetes users should think about.

I recognize that there may be several containers per pod, but I don't think that will be the common case. We should consider making things easy in the common case. Remembering a pod name is easier than remembering a container name, since the former is a REST resource and the latter is not. Some options:

  • use -o SendEnv=K8S_SSH_TO_CONTAINER to specify a container name held in a local environment variable (sketched after this list). The AuthorizedKeysCommand should see this env var, I'm assuming, and run switchns so as to enter that container.
  • a kubessh script could wrap this to set the env var and to show a menu or helpful error message in the case when there are multiple containers in a pod.
  • Either reject the connection or default to the first container in the list when no container is specified.
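
A rough sketch of the client and sshd sides of that SendEnv idea, assuming the bastion's sshd whitelists the variable (whether AuthorizedKeysCommand can actually observe it is the open question noted above; host and names are placeholders):

# client: hold the container name locally and ask ssh to forward it
K8S_SSH_TO_CONTAINER=mycontainer ssh -o SendEnv=K8S_SSH_TO_CONTAINER mypod@bastion.example.com

# sshd_config on the bastion: the variable is dropped unless explicitly accepted
AcceptEnv K8S_SSH_TO_CONTAINER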

@bgrant0607
Member Author

The ability to poke around a terminated container is also needed for PostStop hooks (#140).

@smarterclayton
Contributor

Another thing we've discussed is supporting "not quite SSH" as remote execution of Bash via remote docker exec: essentially proxying STDIN/STDOUT/STDERR via web sockets or SPDY (as per libchan) from a bastion host somewhere else in the cluster, or from the API server. In that case the shell is really just a slightly more complicated use case (requires a tty, latency worries) on top of simpler things like "ps -ef", "top", "du", "tar cvz " that would allow singular or bulk execution of a command across many containers at once. With that form of remote access it's much easier to do things like grant someone a token that allows execution of a very specific command (this token allows you to run du on these pods) for integration. It's almost a separate problem, although you could integrate it into the authenticating ssh proxy if necessary.


@thockin
Member

thockin commented Oct 1, 2014

Do we have anyone on-team who is looking at this area in detail? I think it's a pretty exciting opportunity to do some cool work.


@bgrant0607
Member Author

See also #1521 (attach).

@bgrant0607
Member Author

I think we're going to need a bastion host, in general, for making it easier for users to connect to their applications (e.g., browsing /_status http pages) without giving everything external IPs.

@smarterclayton
Contributor

And ultimately I'd like to expose port forwarding as transient external ports with some gating authentication, which could be an SSH bastion or starting a TLS tunnel with a one-time public/private key pair via the API. The "join my local system into the pod network namespace" use case is really important when debugging an app outside the cluster. You could also do that via a one-off pod running in your network space that exposes an SSH server that the bastion host will then forward to (which gives you true SSH from start to end). Having the bastion host makes audit that much easier, at the expense of some concentration of traffic (i.e. the guy downloading 12gb of database snapshots via your bastion).


@bgrant0607 bgrant0607 added this to the v1.0 milestone Oct 4, 2014
@ncdc
Member

ncdc commented Oct 8, 2014

FYI I've been starting some R&D on an SSH Proxy that will support multiple backends (e.g. SSH to container, Git). The doc is really rough right now, but please feel free to comment: https://docs.google.com/document/d/1Ehx28vQzIJ2L6DoX99JZOzHEHcPCvMCH38WxBwGjc7E

Edit: the doc got DDOS'd. If you'd like access, please request it.

@ncdc
Member

ncdc commented Nov 10, 2014

To follow up on one of @smarterclayton's prior comments (#1513 (comment)), we think that the best approach to giving a user shell access to a container, as well as forwarding a container's private ports back to the user (à la ssh -L), is a solution composed of:

  • an authenticating proxy that sits at the edge (DMZ)
  • a docker exec proxy in the minion

The authenticating proxy could be running sshd, but this will provide a lesser user experience (see below); it is better if the authenticating proxy accepts TLS/SPDY/HTTP requests from a client designed specifically for this purpose.

A user makes a request (e.g. kubectl exec $container bash) to the authenticating proxy, which validates the user's supplied credentials. These credentials should be in a header so the authenticating proxy can forward them to the docker exec proxy later. The authenticating proxy checks with the master to see if the user is authorized to access $container. If so, it queries the master to determine on which minion $container is located. Next, it opens a new connection to the exec proxy on the appropriate minion. The exec proxy performs an additional authentication/authorization check, then runs docker exec with the $command specified by the user in the appropriate $container. It connects stdin/stdout/stderr from the docker exec call to the streams from the authenticating proxy's connection, which forwards them back to the user.
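
A hypothetical trace of that flow (the proxy hostname, endpoint, and query parameters below are purely illustrative, not a defined API):

# 1. client -> authenticating proxy at the edge, credentials carried in a header
curl -H "Authorization: Bearer $USER_TOKEN" \
  "https://auth-proxy.example.com/exec?container=mycontainer&command=bash"
# 2. proxy -> master: authorize the user for mycontainer, look up its minion
# 3. proxy -> docker exec proxy on that minion, forwarding the credential header
# 4. exec proxy re-checks authn/authz, runs the command via docker exec, and
#    streams stdin/stdout/stderr back through both hops to the client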

Port forwarding

When a user makes a request such as kubectl portforward $container --local 8888 --remote 8080 to forward the container's port 8080 locally as port 8888, the authenticating proxy executes a utility such as netcat or socat to connect the appropriate internal port in the container back to the local port on the user's computer.

Caveats:

  • the utility process needs to exist in the container's image
  • if a user wants to run a single command to forward multiple container ports to multiple local ports, we would need to establish a means of multiplexing the traffic over a single connection (presumably something like libchan and/or SPDY can help here)
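
A minimal sketch of what the utility invocation could look like, assuming the traffic is spliced over the exec stream's stdin/stdout (the port numbers echo the example above):

# run via docker exec inside the container: splice the exec connection's
# stdin/stdout to the container's port 8080, so the traffic rides the proxied
# stream back to port 8888 on the user's machine
socat STDIO TCP4:127.0.0.1:8080
# the netcat equivalent, if that's what the image happens to ship
nc 127.0.0.1 8080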

SSH support

Supporting real SSH clients can be achieved by implementing a second authenticating proxy that runs sshd. Because the proxy sits in between the client and the backend container, a user won't be able to simply ssh $user@proxy as that connection URL doesn't contain the coordinates to the backend container. The user must instead do something along the lines of ssh $user@proxy $container $command, and the proxy evaluates the request and forwards it to the appropriate $container.

The SSH proxy would terminate the SSH connection itself, query the master to determine the appropriate minion for $container, and then follow the same steps described above to use the docker exec proxy to connect to $container.

The SSH proxy ideally should support the variety of authentication mechanisms available to SSH: password (token), public key, Kerberos, etc. Once the SSH proxy authenticates the user, it needs to be able to connect to the docker exec proxy on the minion as the remote user. After authentication succeeds in sshd, it no longer has the user's "secrets" to be able to securely authenticate to the exec proxy. To work around this, a custom PAM module can be written and added to the sshd process's session stack to retrieve a unique token for the user from the master and store it in the child process's environment. The SSH proxy can use this environment variable when authenticating on behalf of the user to the exec proxy. As long as the sshd child process has a unique MCS label and execution context, other processes in the proxy won't be able to snoop its environment variables.

@erictune
Member

Does the socat or equivalent utility process have to run inside the container?
What if it runs in the root namespace and connects to $PODIP?

It seems like that would remove the need for socat in the container.
It also removes the need to support a variety of tools (socat, netcat).

@ncdc
Member

ncdc commented Nov 10, 2014

As long as the utility process can connect to the pod's IP and access its internal ports, I'm fine with it running outside the container. It may need to join the pod's network namespace, unless there's some way to do it without going that route?

@smarterclayton
Contributor

Wonder if the kube proxy should/could fulfill part of this role (exec proxy on nodes). The auth check should be a token lookup (we don't have an endpoint that tells you whether you have the right to do an action). The proxy has to handle high traffic (people will cat multi-gig files across stdin). The proxy has to be fair. Sounds a bit less like the kube proxy.

@erictune
Member

Revising what I said above. The utility process should run in the container so its CPU usage is capped by whatever limits the CPU usage of the container.

If we use docker exec to start the utility process, then I think it has to be visible in the container's chroot. Maybe we start all containers with a volume that includes a statically linked utility binary?
If we are willing to maintain our own binary, we can make one that starts outside the container and then enters it.

@smarterclayton
Contributor

What if you want to debug a container that is pegged at 100% cpu? Agreed, most of the time you want to be in the container's cgroup, but I can think of a few exceptions.


@erictune
Member

If you are the cluster admin, then you get to use regular SSH to the node to debug.
If you are an unprivileged pod owner in a multitenant environment, then you can:

  • first try sending a request to grow the pod beyond the CPU demand.
  • deal with it.

If you let a non-admin run outside a cgroup, then he may try to e.g. open a huge file in an editor and make the whole machine OOM. At that point it is unpredictable what gets killed; it might be a container belonging to an innocent bystander.

@erictune
Member

Sorry if that came off as too strongly opinionated.

It's a topic that I have debated at great length internally, but it certainly deserves discussion again in the context of Kubernetes. Consider my last comment my current position on the issue, but I'm open to new arguments 😄

@smarterclayton
Contributor

I actually think it's reasonable. 100% CPU is unusual; people doing stupid things over ssh are not.

Ideally you want "press a button, get sshd" within a given container. It's a shame sshd can't easily be injected into an arbitrary container. We have also considered a go-ssh daemon that accepts a single connection with a custom startup flow.


@smarterclayton
Contributor

Would like to do a deep dive on this next week for sure.

On Nov 25, 2014, at 5:14 PM, Eric Tune notifications@github.com wrote:

I didn't have anything concrete in mind -- I was just thinking about a bunch of alternative implementations.

@erictune
Member

So, @smarterclayton, can you more precisely describe your "Avoid running a proxy process on the node that has access to all containers" requirement?

@smarterclayton
Contributor

Hrm, I don't know that it's a requirement. I'd prefer that the process be in a container for better management of versioning and at least some isolation from the infra, but as we've discussed I don't think it's a hard requirement. Part of my desire is to isolate the admin path from the consumer path so that you have a fallback for debugging issues and a separate security path.


@bgrant0607 bgrant0607 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Dec 3, 2014
@ncdc
Member

ncdc commented Dec 16, 2014

Here's a proposal I've been working on for a SPDY-based approach to run commands in containers, which includes getting shell access. It also describes how to implement port forwarding.

openshift/origin#576

@bgrant0607 bgrant0607 added status/help-wanted priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Jan 10, 2015
@goltermann goltermann removed this from the v1.0 milestone Feb 6, 2015
@bgrant0607 bgrant0607 removed this from the v1.0 milestone Feb 6, 2015
@bgrant0607 bgrant0607 added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Feb 28, 2015
@smarterclayton
Contributor

Now that exec has landed and is looking pretty stable, some followups to look at:

Variants on nsenter that are more aligned with the container

We've talked before about extending nsinit from docker to be a true "execute in running Docker container" process, like rocket's equivalent.

Taking the proxies out of the loop (or at least, the kubelet proxy)

Instead of using nsenter to run the user process, use nsenter to start a process in the container with a predetermined port and secret that would listen for a subsequent connection. The kubelet could then redirect the response back to the caller with the port and location, and the caller can connect directly to the process. The stub process would then do the necessary exec to launch the child and clean up after itself. The important part is that this would make the network I/O part of the container's cgroup, not the kubelet's.

Pattern for launching SSHD in a container via exec

Launch SSHD via exec, then terminate the exec connection and connect from the client directly to a port. Might be a simple package you could install into your container.
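
A rough sketch of that pattern with today's tooling (the image is assumed to ship sshd, host keys, and authorized keys; the port is arbitrary):

# start a throwaway sshd in the container via exec; without -D it daemonizes,
# so the exec connection can then be closed
kubectl exec mypod -c mycontainer -- /usr/sbin/sshd -e -p 2222
# afterwards, connect directly from wherever the pod IP is reachable
ssh -p 2222 user@$POD_IP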

On demand port proxy

Be able to request using your credentials that a new, random port be exposed through the cluster so you can access a container from anywhere that can reach a master. The port might only accept one connection or require a specific source IP

@thockin
Member

thockin commented Jun 4, 2015

Should this bug be closed now?

@smarterclayton
Contributor

I think as a bug it's done. My last comment covers next steps, and it's not a priority right now.

@bgrant0607
Member Author

We should keep this open until we capture that in a doc or another issue.

@nehayward

What's the process to set up a container on kubernetes and then ssh to it? I have minikube running on my computer as well as kubernetes running on GCE, but I haven't been able to ssh to a deployed pod.

@thockin
Member

thockin commented Aug 8, 2016

see kubectl exec
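
For reference, typical invocations look like this (pod and container names are placeholders):

# interactive shell in the pod's only (or first) container
kubectl exec -it mypod -- /bin/sh
# run a one-off command in a specific container
kubectl exec mypod -c mycontainer -- ps -ef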

@nehayward

@thockin thanks for the quick response. I've been able to do that, but I need to be able to ssh to it, as I'm trying to run PerfKit on kubernetes.

@thockin
Member

thockin commented Aug 8, 2016

I don't have any experience setting up "real" SSH in a container


@nehayward

This is the closest thing I've found so far. https://docs.docker.com/engine/examples/running_ssh_service/

@thockin
Member

thockin commented Aug 8, 2016

For the record, it's generally an anti-pattern to run real SSH in a container. That doesn't mean you can't, just that you might want to think about finding a way forward :)


@s1113950

@nehayward if you'd like to ssh to your container, here's how we do it:
https://gist.github.com/s1113950/79b29d6ae82184b15d538e3cfe6a5439

It's an anti-pattern to do it this way, but sometimes all you need is an environment to run your code against, and you'd like to have kubernetes manage that environment :)
SSH also allows for things like remote code syncing via PyCharm!

@kubeoperater

The k8s dashboard implements ssh, but it also doesn't run ssh inside Docker. Take a look at the dashboard's source code: is it calling some kubelet interfaces that we don't know about?
