
Envs on a Kubernetes workload pod override the envs set by a device plugin

If a workload pod defines envs with the same names as envs preset by the platform, the platform's reserved envs are overridden.

Background

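Scenario: a Kubernetes plugin, such as the NVIDIA GPU device plugin, sets environment variables (envs) on a workload pod (these are not the envs of the plugin's own pod). If a container in the workload pod the user creates also sets an env with the same name, it overrides the env the plugin set for that pod.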

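Note: the same holds for any platform built on top of this mechanism. If a workload pod's envs share names with the platform's preset envs, the platform's reserved envs are overridden.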

Kubernetes flow analysis

I traced the container-related flow in Kubernetes 1.20.5. A rough analysis follows.

startContainer

Starts the container.

// startContainer starts a container and returns a message indicates why it is failed on error.
// It starts the container through the following steps:
// * pull the image
// * create the container
// * start the container
// * run the post start lifecycle hooks (if applicable)
func (m *kubeGenericRuntimeManager) startContainer

RunContainerOptions

The environment variables (envs) are a field of RunContainerOptions.

// RunContainerOptions specify the options which are necessary for running containers
type RunContainerOptions struct {
	// The environment variables list.
	// Environment variables
	Envs []EnvVar
	// The mounts for the containers.
	// Mount configuration
	Mounts []Mount
	// The host devices mapped into the containers.
	// Device resources
	Devices []DeviceInfo
	// The annotations for the container
	// These annotations are generated by other components (i.e.,
	// not users). Currently, only device plugins populate the annotations.
	Annotations []Annotation
	// If the container has specified the TerminationMessagePath, then
	// this directory will be used to create and mount the log file to
	// container.TerminationMessagePath
	PodContainerDir string
	// The type of container rootfs
	ReadOnly bool
	// hostname for pod containers
	Hostname string
	// EnableHostUserNamespace sets userns=host when users request host namespaces (pid, ipc, net),
	// are using non-namespaced capabilities (mknod, sys_time, sys_module), the pod contains a privileged container,
	// or using host path volumes.
	// This should only be enabled when the container runtime is performing user remapping AND if the
	// experimental behavior is desired.
	EnableHostUserNamespace bool
}

opts is built by GenerateRunContainerOptions, which populates:

  • envs
  • mounts
  • devices
// GenerateRunContainerOptions generates the RunContainerOptions, which can be used by
// the container runtime to set parameters for launching a container.
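func (kl *Kubelet) GenerateRunContainerOptions(pod *v1.Pod, container *v1.Container, podIP string, podIPs []string) (*kubecontainer.RunContainerOptions, func(), error) {
	// 1. Resources are fetched through the container manager; in practice they are
	//    what the device plugin manager allocated for this container.
	opts, err := kl.containerManager.GetResources(pod, container)
	// ... (error handling, hostname and volume setup omitted in this excerpt)

	// Built-in handling of block volumes; can be ignored here.
	opts.Devices = append(opts.Devices, blkVolumes...)

	// 2. Build the envs configured on the workload pod and its container.
	envs, err := kl.makeEnvironmentVariables(pod, container, podIP, podIPs)
	// ... (error handling omitted in this excerpt)

	// 3. Merge into opts.Envs by appending.
	// Note the ordering: envs coming from Kubernetes resources (device plugins) are first,
	// envs configured at the workload level (pod, container) come after them.
	opts.Envs = append(opts.Envs, envs...)

	// only podIPs is sent to makeMounts, as podIPs is populated even if dual-stack feature flag is not enabled.
	mounts, cleanupAction, err := makeMounts(pod, kl.getPodDir(pod.UID), container, hostname, hostDomainName, podIPs, volumes, kl.hostutil, kl.subpather, opts.Envs)
	if err != nil {
		return nil, cleanupAction, err
	}
	// 4. Merge into opts.Mounts by appending.
	opts.Mounts = append(opts.Mounts, mounts...)

	// ... (remaining setup omitted in this excerpt)
	return opts, cleanupAction, nil
}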
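
The effect of the append order can be reproduced outside the kubelet. The following is only a minimal sketch: EnvVar is a local stand-in for kubecontainer.EnvVar, mergeEnvs and effectiveEnv are hypothetical helpers, and the env values are made up for illustration. It also assumes that, when a name appears twice, the last entry is the one the container ends up seeing, which is the behaviour this article describes.

```go
package main

import "fmt"

// EnvVar is a local stand-in for the kubelet's kubecontainer.EnvVar type.
type EnvVar struct {
	Name  string
	Value string
}

// mergeEnvs mirrors the append order used in GenerateRunContainerOptions:
// envs from the device plugin first, envs from the pod/container spec second.
// (Hypothetical helper, not part of Kubernetes.)
func mergeEnvs(pluginEnvs, containerEnvs []EnvVar) []EnvVar {
	merged := make([]EnvVar, 0, len(pluginEnvs)+len(containerEnvs))
	merged = append(merged, pluginEnvs...)
	return append(merged, containerEnvs...)
}

// effectiveEnv resolves duplicate names under the ASSUMPTION that the last
// entry for a given name wins. (Hypothetical helper.)
func effectiveEnv(envs []EnvVar) map[string]string {
	out := make(map[string]string, len(envs))
	for _, e := range envs {
		out[e.Name] = e.Value
	}
	return out
}

func main() {
	// Illustrative values only.
	pluginEnvs := []EnvVar{{Name: "NVIDIA_VISIBLE_DEVICES", Value: "GPU-0"}}
	containerEnvs := []EnvVar{{Name: "NVIDIA_VISIBLE_DEVICES", Value: "all"}}

	merged := mergeEnvs(pluginEnvs, containerEnvs)
	fmt.Println(len(merged)) // both entries are kept in the list
	fmt.Println(effectiveEnv(merged)["NVIDIA_VISIBLE_DEVICES"])
}
```

Both entries survive the merge; only the later one, the value from the container spec, remains once duplicates are resolved.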

generateContainerConfig

Generates the configuration the container needs in order to run.

startContainer->generateContainerConfig

From RunContainerOptions (opts) to ContainerConfig:

ContainerConfig is the configuration the container finally runs with.

// generateContainerConfig generates container config for kubelet runtime v1.
func (m *kubeGenericRuntimeManager) generateContainerConfig(container *v1.Container, pod *v1.Pod, restartCount int, podIP, imageRef string, podIPs []string, nsTarget *kubecontainer.ContainerID) (*runtimeapi.ContainerConfig, func(), error) {
	// 1. GenerateRunContainerOptions produces opts.Envs.
	opts, cleanupAction, err := m.runtimeHelper.GenerateRunContainerOptions(pod, container, podIP, podIPs)
	// ... (error handling and the computation of command, args, containerLogsPath
	//      and restartCountUint32 are omitted in this excerpt)

	// config is the configuration the container runs with; part of it is derived from RunContainerOptions.
	config := &runtimeapi.ContainerConfig{
		Metadata: &runtimeapi.ContainerMetadata{
			Name:    container.Name,
			Attempt: restartCountUint32,
		},
		Image:       &runtimeapi.ImageSpec{Image: imageRef},
		Command:     command,
		Args:        args,
		WorkingDir:  container.WorkingDir,
		Labels:      newContainerLabels(container, pod),
		Annotations: newContainerAnnotations(container, pod, restartCount, opts),
		Devices:     makeDevices(opts),
		Mounts:      m.makeMounts(opts, container),
		LogPath:     containerLogsPath,
		Stdin:       container.Stdin,
		StdinOnce:   container.StdinOnce,
		Tty:         container.TTY,
	}

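	// 2. Iterate over opts.Envs and write every entry into config.Envs as-is. Names and values are
	//    not validated or de-duplicated, so a later entry with the same name overrides an earlier one.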

	// set environment variables
	envs := make([]*runtimeapi.KeyValue, len(opts.Envs))
	for idx := range opts.Envs {
		e := opts.Envs[idx]
		envs[idx] = &runtimeapi.KeyValue{
			Key:   e.Name,
			Value: e.Value,
		}
	}
	config.Envs = envs

	return config, cleanupAction, nil
}
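
Since the loop above copies entries one-to-one, a collision is only visible if something looks for it explicitly. The sketch below is an illustration, not kubelet code: EnvVar and KeyValue are local stand-ins for kubecontainer.EnvVar and runtimeapi.KeyValue, duplicateNames is a hypothetical helper, and the env values are made up.

```go
package main

import "fmt"

// EnvVar and KeyValue are local stand-ins for kubecontainer.EnvVar and
// runtimeapi.KeyValue so that the sketch compiles on its own.
type EnvVar struct{ Name, Value string }
type KeyValue struct{ Key, Value string }

// toKeyValues mirrors the loop in generateContainerConfig: a one-to-one copy
// with no check for repeated names.
func toKeyValues(envs []EnvVar) []*KeyValue {
	out := make([]*KeyValue, len(envs))
	for i := range envs {
		out[i] = &KeyValue{Key: envs[i].Name, Value: envs[i].Value}
	}
	return out
}

// duplicateNames is a hypothetical helper (not part of the kubelet) that
// reports each env name occurring more than once, in order of first repeat.
func duplicateNames(envs []EnvVar) []string {
	seen := make(map[string]int, len(envs))
	var dups []string
	for _, e := range envs {
		seen[e.Name]++
		if seen[e.Name] == 2 {
			dups = append(dups, e.Name)
		}
	}
	return dups
}

func main() {
	envs := []EnvVar{
		{Name: "NVIDIA_VISIBLE_DEVICES", Value: "GPU-0"}, // from the device plugin (illustrative)
		{Name: "APP_MODE", Value: "prod"},                // from the container spec (illustrative)
		{Name: "NVIDIA_VISIBLE_DEVICES", Value: "all"},   // from the container spec (illustrative)
	}
	fmt.Println(len(toKeyValues(envs))) // every entry is copied, duplicates included
	fmt.Println(duplicateNames(envs))
}
```

A check of this kind could run wherever pod specs are validated before submission, so that names reserved by a plugin or platform are rejected instead of silently overridden.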

Note

Other fields of opts are handled the same way as envs:

  • opts.Devices -> config.Devices

  • opts.Mounts -> config.Mounts