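If a workload pod defines envs with the same names as the platform's preset envs, the platform's reserved envs get overwritten.
Background
Scenario: a k8s plugin, for example the NVIDIA GPU plugin, sets environment variables (envs) on the workload pod (not on the plugin pod itself). If a container in the user's workload pod also defines envs with the same names, they override the envs that the plugin set on the pod.
Note: more generally, whenever a workload pod's envs share a name with the platform's preset envs, the platform's reserved envs are overwritten.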
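To make the collision concrete, here is a minimal sketch of how such conflicts could be detected before a pod is created. `findOverriddenEnvs` is a hypothetical helper, not part of Kubernetes, and the env values are illustrative only; `NVIDIA_VISIBLE_DEVICES` is used as an example of a variable the NVIDIA device plugin sets.

```go
package main

import "fmt"

// EnvVar mirrors the Name/Value shape of an environment variable entry.
type EnvVar struct {
	Name  string
	Value string
}

// findOverriddenEnvs is a hypothetical helper: it returns the names of
// reserved envs that also appear in the user-defined list, in the order
// they appear in the reserved list.
func findOverriddenEnvs(reserved, user []EnvVar) []string {
	userNames := make(map[string]struct{}, len(user))
	for _, e := range user {
		userNames[e.Name] = struct{}{}
	}
	var conflicts []string
	for _, e := range reserved {
		if _, ok := userNames[e.Name]; ok {
			conflicts = append(conflicts, e.Name)
		}
	}
	return conflicts
}

func main() {
	// Envs injected by a plugin (values are made up for illustration).
	reserved := []EnvVar{{Name: "NVIDIA_VISIBLE_DEVICES", Value: "GPU-0"}}
	// Envs declared by the user on the workload container.
	user := []EnvVar{
		{Name: "NVIDIA_VISIBLE_DEVICES", Value: "all"},
		{Name: "APP_MODE", Value: "prod"},
	}
	fmt.Println(findOverriddenEnvs(reserved, user))
}
```

A check like this could run in whatever layer creates the workload pod, so that a same-named env is rejected or flagged instead of silently replacing the reserved value.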
k8s flow analysis
I traced the container-related flow in k8s 1.20.5; a rough analysis follows:
startContainer
Starts the container.
| // startContainer starts a container and returns a message indicates why it is failed on error.
// It starts the container through the following steps:
// * pull the image
// * create the container
// * start the container
// * run the post start lifecycle hooks (if applicable)
func (m *kubeGenericRuntimeManager) startContainer
|
RunContainerOptions
The environment variables (envs) are part of RunContainerOptions.
| // RunContainerOptions specify the options which are necessary for running containers
type RunContainerOptions struct {
	// The environment variables list.
	Envs []EnvVar
	// The mounts for the containers (mount configuration).
	Mounts []Mount
	// The host devices mapped into the containers (resource devices).
	Devices []DeviceInfo
	// The annotations for the container
	// These annotations are generated by other components (i.e.,
	// not users). Currently, only device plugins populate the annotations.
	Annotations []Annotation
	// If the container has specified the TerminationMessagePath, then
	// this directory will be used to create and mount the log file to
	// container.TerminationMessagePath
	PodContainerDir string
	// The type of container rootfs
	ReadOnly bool
	// hostname for pod containers
	Hostname string
	// EnableHostUserNamespace sets userns=host when users request host namespaces (pid, ipc, net),
	// are using non-namespaced capabilities (mknod, sys_time, sys_module), the pod contains a privileged container,
	// or using host path volumes.
	// This should only be enabled when the container runtime is performing user remapping AND if the
	// experimental behavior is desired.
	EnableHostUserNamespace bool
}
|
opts is built by GenerateRunContainerOptions:
| // GenerateRunContainerOptions generates the RunContainerOptions, which can be used by
// the container runtime to set parameters for launching a container.
func (kl *Kubelet) GenerateRunContainerOptions(pod *v1.Pod, container *v1.Container, podIP string, podIPs []string) (*kubecontainer.RunContainerOptions, func(), error) {
	// 1. Resource info obtained from the container manager; in practice this is
	//    the resource allocation info that the device plugin manager provides.
	opts, err := kl.containerManager.GetResources(pod, container)
	// Built-in handling, can be ignored ...
	opts.Devices = append(opts.Devices, blkVolumes...)
	// 2. Build the envs configured on the workload pod and container.
	envs, err := kl.makeEnvironmentVariables(pod, container, podIP, podIPs)
	// 3. Merge into opts.Envs by appending.
	// Note the ordering: envs from k8s resources come first; envs configured
	// at the workload level (pod, container) come after them.
	opts.Envs = append(opts.Envs, envs...)
	// only podIPs is sent to makeMounts, as podIPs is populated even if dual-stack feature flag is not enabled.
	mounts, cleanupAction, err := makeMounts(pod, kl.getPodDir(pod.UID), container, hostname, hostDomainName, podIPs, volumes, kl.hostutil, kl.subpather, opts.Envs)
	if err != nil {
		return nil, cleanupAction, err
	}
	// 4. Merge into opts.Mounts by appending.
	opts.Mounts = append(opts.Mounts, mounts...)
}
|
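The ordering in step 3 is what matters for the override, so here is a minimal standalone sketch of it. `EnvVar` and `mergeEnvs` are simplified stand-ins written for this example (assumptions, not kubelet code); only the append order mirrors the excerpt above.

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for kubelet's container EnvVar type.
type EnvVar struct {
	Name  string
	Value string
}

// mergeEnvs mimics `opts.Envs = append(opts.Envs, envs...)`:
// resource-provided envs first, workload envs after, no de-duplication.
func mergeEnvs(resourceEnvs, workloadEnvs []EnvVar) []EnvVar {
	merged := make([]EnvVar, 0, len(resourceEnvs)+len(workloadEnvs))
	merged = append(merged, resourceEnvs...)
	merged = append(merged, workloadEnvs...)
	return merged
}

func main() {
	// Illustrative values only.
	resourceEnvs := []EnvVar{{Name: "NVIDIA_VISIBLE_DEVICES", Value: "GPU-0"}}
	workloadEnvs := []EnvVar{
		{Name: "NVIDIA_VISIBLE_DEVICES", Value: "all"},
		{Name: "APP_MODE", Value: "prod"},
	}
	for i, e := range mergeEnvs(resourceEnvs, workloadEnvs) {
		fmt.Printf("%d %s=%s\n", i, e.Name, e.Value)
	}
}
```

Both `NVIDIA_VISIBLE_DEVICES` entries survive the merge; the workload-level one simply sits later in the list.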
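generateContainerConfig
Generates the configuration the container needs to run.
startContainer -> generateContainerConfig
From RunContainerOptions (opts) to ContainerConfig:
ContainerConfig is the configuration the container finally uses.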
| // generateContainerConfig generates container config for kubelet runtime v1.
func (m *kubeGenericRuntimeManager) generateContainerConfig(container *v1.Container, pod *v1.Pod, restartCount int, podIP, imageRef string, podIPs []string, nsTarget *kubecontainer.ContainerID) (*runtimeapi.ContainerConfig, func(), error) {
	// 1. GenerateRunContainerOptions produces opts.Envs.
	opts, cleanupAction, err := m.runtimeHelper.GenerateRunContainerOptions(pod, container, podIP, podIPs)
	// config is the configuration the container needs to run; part of it
	// depends on RunContainerOptions.
	config := &runtimeapi.ContainerConfig{
		Metadata: &runtimeapi.ContainerMetadata{
			Name:    container.Name,
			Attempt: restartCountUint32,
		},
		Image:       &runtimeapi.ImageSpec{Image: imageRef},
		Command:     command,
		Args:        args,
		WorkingDir:  container.WorkingDir,
		Labels:      newContainerLabels(container, pod),
		Annotations: newContainerAnnotations(container, pod, restartCount, opts),
		Devices:     makeDevices(opts),
		Mounts:      m.makeMounts(opts, container),
		LogPath:     containerLogsPath,
		Stdin:       container.Stdin,
		StdinOnce:   container.StdinOnce,
		Tty:         container.TTY,
	}
	// 2. Iterate over opts.Envs and write each entry into config.Envs with no
	//    validation of the contents and no check for duplicate names, so a
	//    same-named entry later in the list ends up overriding the earlier one.
	// set environment variables
	envs := make([]*runtimeapi.KeyValue, len(opts.Envs))
	for idx := range opts.Envs {
		e := opts.Envs[idx]
		envs[idx] = &runtimeapi.KeyValue{
			Key:   e.Name,
			Value: e.Value,
		}
	}
	config.Envs = envs
	return config, cleanupAction, nil
}
|
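The excerpt copies every entry into the config unchanged, so the override only shows up once something resolves the list into one value per key. The sketch below illustrates that with a last-one-wins resolution. `KeyValue` is a simplified stand-in for `runtimeapi.KeyValue`, and `effectiveEnv` is a hypothetical function; that the consumer of the list resolves duplicates this way is an assumption consistent with the behavior described in this post, not something shown in the kubelet code above.

```go
package main

import "fmt"

// KeyValue is a simplified stand-in for runtimeapi.KeyValue.
type KeyValue struct {
	Key   string
	Value string
}

// effectiveEnv is a hypothetical resolver: it walks the list in order and
// lets a later entry replace an earlier one with the same key
// (assumed last-one-wins behavior on the consuming side).
func effectiveEnv(envs []*KeyValue) map[string]string {
	resolved := make(map[string]string, len(envs))
	for _, kv := range envs {
		resolved[kv.Key] = kv.Value
	}
	return resolved
}

func main() {
	// Order matches the append order: plugin-provided entry first,
	// workload-provided entry second. Values are illustrative.
	envs := []*KeyValue{
		{Key: "NVIDIA_VISIBLE_DEVICES", Value: "GPU-0"},
		{Key: "NVIDIA_VISIBLE_DEVICES", Value: "all"},
	}
	fmt.Println(effectiveEnv(envs)["NVIDIA_VISIBLE_DEVICES"])
}
```

Under that assumption the plugin's value `GPU-0` is lost and the container sees the workload's value instead.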
Notes
Other fields in opts that are handled with logic similar to envs: