Support OCI images in API #7203

Closed
vmarmol opened this issue Apr 23, 2015 · 59 comments
Labels
area/api Indicates an issue on api area. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@vmarmol
Contributor

vmarmol commented Apr 23, 2015

Today the API assumes Docker images, but we'd like to be able to support other images in the future. This issue specifically targets the ACI standard (https://github.com/appc/spec/blob/master/SPEC.md#app-container-image), which is used by rkt (https://github.com/coreos/rkt).

Arguably having the flexibility to have a different image type in the API is for v1.0, although actually supporting a different image type is not.

/cc @dchen1107 @bgrant0607 @yifan-gu @jonboulle

@vmarmol vmarmol added area/api Indicates an issue on api area. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. area/rkt labels Apr 23, 2015
@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Apr 23, 2015
@thockin
Member

thockin commented Apr 23, 2015

We have to decide whether the API should reflect the choice of runtime at all. Ideally the API is agnostic to all that. Container image is just a string that we pass down. I'm not sure we can do that, though. I really dislike the idea of the end user having to change their behavior based on the admin's choice of runtime.

It's a Kubernetes ideal that the same pod spec should run on a docker cluster and on a rocket cluster.


@bgrant0607
Member

Other example runtimes / formats:

  • Fabric8 Java
  • LXC + some image format (targz?)
  • Google internal
  • Non-Linux platforms: BSD, Solaris, Windows, ...

Eventually we'll need some kind of plugin API in the pod spec.

Ideally, when running in a cloud, the system would eventually be able to instantiate VMs capable of running whatever image/spec was provided, as necessary.

I agree that differences in semantics should not be implicit based on implementation choices, and the pod spec should not need to be changed in the case that the same semantics are desired on different runtimes.

@thockin
Member

thockin commented Apr 23, 2015

By way of example, I would be very against a solution that required

    image: nginx

on docker runtimes but required

    image: docker://nginx

on rkt.


@bgrant0607
Member

Agree, though requiring a prefix for all images would be a breaking change. We could do that in v1 and add a format field in the older APIs.

However, with respect to images, there are multiple issues:

  • name/address format
  • data format
  • image identity, both mutable (tags) and immutable (content-based id)
  • location to fetch from

I haven't had time to think about which we want to explicitly represent. In enterprise environments, admins generally want control over the latter two.
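Purely as an illustration of those four dimensions as separate fields (none of these names exist in the Kubernetes API; this is only a sketch), a structure covering them might look like:

```go
// ImageReference is a hypothetical sketch separating the four dimensions
// listed above; it is not a proposed API type.
type ImageReference struct {
    Name   string // name/address format, e.g. "kubernetes/pause"
    Format string // data format of the image payload, e.g. "docker-v1", "aci"
    Tag    string // mutable identity
    Digest string // immutable, content-based identity
    Source string // explicit location to fetch from, if the admin pins it
}
```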

@vmarmol
Contributor Author

vmarmol commented Apr 23, 2015

In the short term, ACI-based images do natively support Docker images so no changes would need to be made to express Docker images. For others we would have to identify and know where to schedule them. Initially it'd be an explicit choice on the user's part I imagine.

If we did have to schedule only on nodes that supported format X, parsing the image string is probably not the best way.

@jonboulle
Contributor

@vmarmol:

ACI-based images do natively support Docker images

"ACI-based images" -> "ACI-based runtimes"? Even so, not strictly correct; rkt natively supports Docker images (by transparently converting them to ACIs using docker2aci), but this isn't a provision of the spec (i.e. other implementations might not have this support).

@jonboulle
Contributor

A little more detail on the ACI side:

It is definitely a different beast from Docker in that there's no shorthand way to deterministically [1] reference remote images (in part since there's no registry API). Instead, the appc spec currently defines two processes (strategies) for discovering image locations [2], where "location" is an HTTP endpoint, torrent, HDFS URI, etc.; but these strategies are not necessarily deterministic, depending on their implementation (essentially this is the "mutable" identity bgrant0607 refers to above). The only canonical image reference is the image ID (the immutable, content-based identity).

Taking rkt as an example again, it currently supports three image reference types:
a) a discovery reference
b) a direct URI (either HTTP/S or a file on local disk)
c) an image ID (at the moment this must be already locally cached in rkt's CAS, but you can imagine an implementation where it is available in a well-known remote CAS as well)

[1] For some definition of deterministic; obviously Docker repositories can still be mutated behind the scenes. But there is a general assumption in the ecosystem today that "nginx:14.04" or "foo.com/bar:baz" refer to a consistent image over time. (I know Docker have just added content-based tags to provide this)
[2] Two ordered processes, but soon to be reframed as a pluggable set of strategies

@philips
Contributor

philips commented Apr 28, 2015

I think where we need to end up on this is having the API be very explicit about the user's intended image. This is something like how the volume API is explicit about the type of volume and its arguments:

  manifest:
    containers:
      - image: kubernetes/pause
        name: testpd
        volumeMounts:
          - mountPath: "/testpd"
            name: "testpd"
    id: testpd
    version: v1beta1
    volumes:
      - name: testpd
        source:
          persistentDisk:
            # This GCE PD must already exist.
            pdName: test
            fsType: ext4

So if we take this same approach and change the image we get something like:

docker images

  manifest:
    containers:
      - name: testpd
        image:
            docker:
                repository: kubernetes/pause
                tag: latest
                digest: sha256:cbbf2...
        volumeMounts:
          - mountPath: "/testpd"
            name: "testpd"

ACIs

  manifest:
    containers:
      - name: testpd
        image:
            aci:
                name: kubernetes.io/pause
                labels:
                   version: 2.0.9
                id: sha512-a83b3f...
        volumeMounts:
          - mountPath: "/testpd"
            name: "testpd"

Possible problems and points of research:

@philips
Copy link
Contributor

philips commented Apr 28, 2015

On @bgrant0607's point about letting the user control where things are fetched from: this feels like a configuration problem that is orthogonal to this one. If the user wants to configure their container runtime to fetch from an NFS share, local store, etc., we need a cluster-wide or namespace-wide API for that sort of thing.

@pmorie
Member

pmorie commented Apr 28, 2015

@philips Presumably there will be a validation to ensure that each container in a pod has the same image type, correct?

@philips
Contributor

philips commented Apr 28, 2015

@pmorie I don't know if there is a technical reason for this restriction; rkt, for example, can run both docker and ACI images in the same pod.

@pmorie
Member

pmorie commented Apr 28, 2015

@philips That is a really good point -- I guess if rocket is available there shouldn't be any problem with letting the rocket pod handle the docker image pull. That said, my gut says there may be some issues to work through -- for example making sure that rkt and docker can pull using the same secret.

@philips
Contributor

philips commented Apr 28, 2015

> That said, my gut says there may be some issues to work through -- for example making sure that rkt and docker can pull using the same secret

It should be easily possible:
https://github.com/coreos/rkt/blob/master/Documentation/configuration.md#configuration-kinds

I think @yifan-gu is hooking up the k8s secret auth configuration to this rkt feature for docker registry auth.

@pmorie
Member

pmorie commented Apr 28, 2015

@philips Yep, we've been communicating about that.

@bgrant0607
Member

@philips The fields you propose match the ACI pod manifest schema? What structure would you propose for Docker images?

@philips
Contributor

philips commented Apr 28, 2015

@bgrant0607 The first one is for docker images; here it is again:

  manifest:
    containers:
      - name: testpd
        image:
            docker:
                repository: kubernetes/pause
                tag: latest
                digest: sha256:cbbf2...
        volumeMounts:
          - mountPath: "/testpd"
            name: "testpd"

This is based on the docker image JSON and the new Docker 1.6 namespace/repo:tag@digest schema, which I feel is getting us to a place where the string is too fancy and should be pulled out into separate sections.

@ncdc
Member

ncdc commented May 6, 2015

Note: the digest format for Docker 1.6 is namespace/repo@digest - :tag is not part of it when using a digest, and you can specify either a tag or a digest, but not both.
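To illustrate the tag-versus-digest rule, here is a simplified (and deliberately naive) split of a Docker 1.6-style reference; real parsing is done by Docker's reference library and handles far more cases, so treat this only as a sketch:

```go
package main

import (
    "fmt"
    "strings"
)

// splitReference splits a reference into name, tag, and digest. A digest
// reference (namespace/repo@sha256:...) never carries a tag, matching the
// "tag or digest, but not both" rule above.
func splitReference(ref string) (name, tag, digest string) {
    if i := strings.LastIndex(ref, "@"); i != -1 {
        return ref[:i], "", ref[i+1:]
    }
    // An optional :tag may follow the name; the colon must come after the
    // last "/" so a registry port is not mistaken for a tag.
    if i := strings.LastIndex(ref, ":"); i != -1 && i > strings.LastIndex(ref, "/") {
        return ref[:i], ref[i+1:], ""
    }
    return ref, "", ""
}

func main() {
    fmt.Println(splitReference("kubernetes/pause@sha256:cbbf2..."))
    fmt.Println(splitReference("gcr.io/google_containers/pause:latest"))
}
```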

@pmorie
Member

pmorie commented May 6, 2015

There is not a one-to-one mapping between container runtime and image format. Rocket supports running docker images, and so it makes sense that you could run a pod that has a docker image spec with a rocket container runtime with no changes. In the future, the ACI format might be supported by docker, in which case the opposite would be possible: run a pod with an ACI image spec on the docker container runtime.

@pmorie
Member

pmorie commented May 6, 2015

Let me offer the following API skeleton for discussion:

type Container struct {
    // other fields omitted
    Image ImageSpec
}

type ImageSpec struct {
    DockerV1 *DockerV1RegistryImageSpec
    ACI      *ACISpec
    // FUTURE: other formats (fabric8, lxc, etc)
}

The image spec ought to be orthogonal to the container runtime. That said, eventually I think the node should inform the cluster of which image specs it can support, and the scheduler will have to ensure that pods get scheduled onto a node that supports the image format.
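To make the scheduling idea concrete, here is a minimal sketch (not actual scheduler code) of a predicate that rejects pods whose image formats the node does not advertise; the Pod type and the format strings are hypothetical, and only ImageSpec follows the skeleton above:

```go
// Pod is a hypothetical stand-in; only Containers matters for this sketch.
type Pod struct {
    Containers []Container
}

// podFitsImageFormats returns true if every container image in the pod uses a
// format the node claims to support (e.g. "docker-v1", "aci").
func podFitsImageFormats(pod Pod, nodeFormats []string) bool {
    supported := map[string]bool{}
    for _, f := range nodeFormats {
        supported[f] = true
    }
    for _, c := range pod.Containers {
        switch {
        case c.Image.DockerV1 != nil && !supported["docker-v1"]:
            return false
        case c.Image.ACI != nil && !supported["aci"]:
            return false
        }
    }
    return true
}
```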

@thockin
Member

thockin commented May 8, 2015

This is another place where a discriminated, overlaid one-of type would make a good API. #6979

    "image": {
        "kind": "aci",
        "name": "example.com/foobar-worker",
        "id": "afe264840302"
    }

vs

    "image": {
        "kind": "docker",
        "name": "gcr.io/google_containers/foobar-worker:1.2"
    }
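For comparison, one possible flat reading of such a "kind"-discriminated image as a Go type; this is purely illustrative and the field names are not from any existing Kubernetes type:

```go
// Image is a flat, "kind"-discriminated shape matching the two JSON examples above.
type Image struct {
    Kind string `json:"kind"`         // "docker" or "aci"
    Name string `json:"name"`         // repository or ACI name
    ID   string `json:"id,omitempty"` // content-based image ID (used in the aci example)
}
```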


@philips
Contributor

philips commented May 8, 2015

I agree with @thockin's suggestion here.

@thockin
Member

thockin commented May 8, 2015

Sadly, we do not currently have Go infrastructure for decoding JSON this way, unless you guys have some tricks you want to share... We have a sketch of an idea of autogenerating unmarshalling code, nothing further.
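Absent generated code, a hand-written UnmarshalJSON along these lines is one way to decode a "kind"-discriminated image; this is only a sketch, and the type and field names are made up for illustration:

```go
package main

import (
    "encoding/json"
    "fmt"
)

type DockerImage struct {
    Name string `json:"name"`
}

type ACIImage struct {
    Name string `json:"name"`
    ID   string `json:"id"`
}

// Image holds exactly one of the variants after unmarshalling.
type Image struct {
    Docker *DockerImage
    ACI    *ACIImage
}

func (img *Image) UnmarshalJSON(data []byte) error {
    // Peek at the discriminator first, then decode the rest into the variant.
    var head struct {
        Kind string `json:"kind"`
    }
    if err := json.Unmarshal(data, &head); err != nil {
        return err
    }
    switch head.Kind {
    case "docker":
        img.Docker = &DockerImage{}
        return json.Unmarshal(data, img.Docker)
    case "aci":
        img.ACI = &ACIImage{}
        return json.Unmarshal(data, img.ACI)
    default:
        return fmt.Errorf("unknown image kind %q", head.Kind)
    }
}

func main() {
    var img Image
    _ = json.Unmarshal([]byte(`{"kind": "aci", "name": "example.com/foobar-worker", "id": "afe264840302"}`), &img)
    fmt.Printf("%+v\n", img.ACI)
}
```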


@runcom
Contributor

runcom commented Sep 14, 2016

Any movement on this one? Is there a consensus to have a kind field to express which kind of image we are dealing with?

I'm asking because the Container Runtime Interface (#17048) just receives an anonymous ImageSpec which defaults to a Docker image - by Docker image I mean a Docker image reference host[:port]/namespace/name[:tag|@digest], to be downloaded from a Docker registry.

But in https://github.com/containers/image, we developed a Transport type which identifies an image and how it should be fetched/pushed. Currently we support:

  • push/pull from/to Docker registries
  • push/pull from/to the Atomic Registry in OpenShift
  • push to a local directory following the OCI image layout spec
  • push/pull layers and the manifest to/from a local directory

It basically consists of a kind prefix which identifies the kind of image we're dealing with:

  • atomic:
  • docker://
  • oci:
  • dir:

Plus a URI-like string which is transport dependent (e.g. docker follows its own format, while oci and dir are just paths, like dir:/local/path).

I've been thinking it would be great to have at least a discriminant (kind) so CRI implementors can decide whether they support the given image kind rather than just assuming it's a Docker image. In some deployments the transport could just be a path to an NFS-mounted directory instead of pulling anything.
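As a rough illustration of the discriminant idea (not the actual containers/image parser, which has its own entry point and registered transports), splitting the prefix off a reference could look something like this sketch:

```go
package main

import (
    "fmt"
    "strings"
)

// knownTransports mirrors the prefixes listed above; a real implementation
// would consult the registered transports instead of a hardcoded set.
var knownTransports = map[string]bool{"atomic": true, "docker": true, "oci": true, "dir": true}

// splitTransport returns the transport name and the transport-specific
// remainder. If no known prefix is present, the caller can decide what to
// assume instead of hardcoding Docker.
func splitTransport(ref string) (transport, rest string) {
    if i := strings.Index(ref, ":"); i != -1 && knownTransports[ref[:i]] {
        return ref[:i], strings.TrimPrefix(ref[i+1:], "//")
    }
    return "", ref
}

func main() {
    fmt.Println(splitTransport("docker://busybox"))
    fmt.Println(splitTransport("dir:/local/path"))
    fmt.Println(splitTransport("busybox:latest"))
}
```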

/cc @philips @thockin et al.

@philips
Contributor

philips commented Sep 15, 2016

@runcom step 1 is to get a proposal for the API change. There is an old proposal that could be used; see the feature issue for the old proposal link: kubernetes/enhancements#63

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 20, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 19, 2018
@bgrant0607 bgrant0607 changed the title from "Support ACI images in API" to "Support OCI images in API" Jan 22, 2018
@bgrant0607
Member

/remove-lifecycle rotten
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 22, 2018
runcom added a commit to runcom/kubernetes that referenced this issue Jan 29, 2018
This patch fixes a regression introduced by
kubernetes#51751 in the CRI
interface.
That commit actually changed a unit test where we were previously *not*
assuming anything about an image name.
Before that commit, if you send the image "busybox" through the CRI,
the container runtime receives "busybox". After that patch the
container runtime gets "docker.io/library/busybox".
While that may be correct for the internal kube dockershim, in the CRI
we must not assume anything about image names. The ImageSpec does not
provide any spec around the image, so the container runtime should
just get the raw image name from the pod spec. Every container runtime
can handle image names the way it wants. The "docker.io" namespace is
not at all "standard"; CRI-O does not follow what the Docker UI does,
since that is purely a UI behavior. We should not base the CRI on that
UI design, especially around a default namespace.

ImageSpec is not standardized yet:
kubernetes#46255 and
kubernetes#7203

This is something which should land in 1.9 as well since the regression
is from 1.8.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
k8s-github-robot pushed a commit that referenced this issue Jan 29, 2018
Automatic merge from submit-queue (batch tested with PRs 58955, 58968, 58971, 58963, 58298). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

pkg: kubelet: do not assume anything about image names

Image name normalization is a Docker implementation detail around short image names, not a CRI concern.

ImageSpec is not standardized yet:
#46255 and
#7203

This is something which should land in 1.9 as well since the regression
is from 1.8.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>



**Release note**:

```release-note
Fix regression in the CRI: do not add a default hostname on short image names
```
iwankgb pushed a commit to iwankgb/kubernetes that referenced this issue Feb 15, 2018
@Fak3

Fak3 commented Oct 18, 2018

Any progress on this?

@dims
Member

dims commented Oct 18, 2018

long-term-issue (note to self)

@bgrant0607
Member

This is obsolete, but I'll leave a ref to #1697
/close

@k8s-ci-robot
Contributor

@bgrant0607: Closing this issue.

In response to this:

This is obsolete, but I'll leave a ref to #1697
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
