Support filter restrict network interface (#14638)
ruanwenjun authored Jul 26, 2023
1 parent 5a550dd commit 2b99451
Showing 7 changed files with 128 additions and 80 deletions.
63 changes: 32 additions & 31 deletions docs/docs/en/architecture/configuration.md

Large diffs are not rendered by default.

63 changes: 32 additions & 31 deletions docs/docs/zh/architecture/configuration.md
@@ -200,37 +200,38 @@ The common.properties configuration file currently mainly configures hadoop/s3/yarn/applicationId

 The default configuration is as follows:

-| Parameter | Default value | Description |
-|--|--|--|
-|data.basedir.path | /tmp/dolphinscheduler | Local working directory, used to store temporary files|
-|resource.storage.type | NONE | Resource file storage type: HDFS, S3, OSS, GCS, ABS, NONE|
-|resource.upload.path | /dolphinscheduler | Storage path for resource files|
-|aws.access.key.id | minioadmin | S3 access key|
-|aws.secret.access.key | minioadmin | S3 secret access key|
-|aws.region | us-east-1 | S3 region|
-|aws.s3.endpoint | http://minio:9000 | S3 endpoint address|
-|hdfs.root.user | hdfs | If the storage type is HDFS, configure a user with the corresponding operation permissions|
-|fs.defaultFS | hdfs://mycluster:8020 | Request address. If resource.storage.type=S3, the value looks like s3a://dolphinscheduler. If resource.storage.type=HDFS and Hadoop is configured for HA, copy core-site.xml and hdfs-site.xml into the conf directory|
-|hadoop.security.authentication.startup.state | false | Whether Kerberos authentication is enabled for Hadoop|
-|java.security.krb5.conf.path | /opt/krb5.conf | Kerberos configuration directory|
-|login.user.keytab.username | [email protected] | Kerberos login user|
-|login.user.keytab.path | /opt/hdfs.headless.keytab | Keytab of the Kerberos login user|
-|kerberos.expire.time | 2 | Kerberos expiration time, an integer, in hours|
-|yarn.resourcemanager.ha.rm.ids | 192.168.xx.xx,192.168.xx.xx | YARN ResourceManager addresses. If ResourceManager HA is enabled, enter the HA IP addresses (comma-separated); if the ResourceManager is a single node, leave the value empty|
-|yarn.application.status.address | http://ds1:8088/ws/v1/cluster/apps/%s | If ResourceManager HA is enabled or ResourceManager is not used, keep the default value. If the ResourceManager is a single node, replace ds1 with the ResourceManager's hostname|
-|development.state | false | Whether development mode is enabled|
-|dolphin.scheduler.network.interface.preferred | NONE | Network interface name|
-|dolphin.scheduler.network.priority.strategy | default | IP acquisition strategy; default prefers the inner (intranet) address|
-|resource.manager.httpaddress.port | 8088 | ResourceManager port|
-|yarn.job.history.status.address | http://ds1:19888/ws/v1/history/mapreduce/jobs/%s | YARN job history status URL|
-|datasource.encryption.enable | false | Whether to enable datasource encryption|
-|datasource.encryption.salt | !@#$%^&* | Salt used for datasource encryption|
-|data-quality.jar.name | dolphinscheduler-data-quality-dev-SNAPSHOT.jar | Jar package used for data quality|
-|support.hive.oneSession | false | Whether Hive SQL runs in the same session|
-|sudo.enable | true | Whether to enable sudo|
-|alert.rpc.port | 50052 | RPC port of the Alert Server|
-|zeppelin.rest.url | http://localhost:8080 | Zeppelin RESTful API address|
-|appId.collect | log | How the applicationId is collected. To use the aop method, change log to aop and uncomment the applicationId auto-collection environment variables in `bin/env/dolphinscheduler_env.sh`. Note: aop does not support submitting YARN jobs from a remote host (for example through the Beeline client), and it stops working if the user environment overrides the applicationId-collection variables from dolphinscheduler_env.sh|
+| Parameter | Default value | Description |
+|---|---|---|
+| data.basedir.path | /tmp/dolphinscheduler | Local working directory, used to store temporary files |
+| resource.storage.type | NONE | Resource file storage type: HDFS, S3, OSS, GCS, ABS, NONE |
+| resource.upload.path | /dolphinscheduler | Storage path for resource files |
+| aws.access.key.id | minioadmin | S3 access key |
+| aws.secret.access.key | minioadmin | S3 secret access key |
+| aws.region | us-east-1 | S3 region |
+| aws.s3.endpoint | http://minio:9000 | S3 endpoint address |
+| hdfs.root.user | hdfs | If the storage type is HDFS, configure a user with the corresponding operation permissions |
+| fs.defaultFS | hdfs://mycluster:8020 | Request address. If resource.storage.type=S3, the value looks like s3a://dolphinscheduler. If resource.storage.type=HDFS and Hadoop is configured for HA, copy core-site.xml and hdfs-site.xml into the conf directory |
+| hadoop.security.authentication.startup.state | false | Whether Kerberos authentication is enabled for Hadoop |
+| java.security.krb5.conf.path | /opt/krb5.conf | Kerberos configuration directory |
+| login.user.keytab.username | [email protected] | Kerberos login user |
+| login.user.keytab.path | /opt/hdfs.headless.keytab | Keytab of the Kerberos login user |
+| kerberos.expire.time | 2 | Kerberos expiration time, an integer, in hours |
+| yarn.resourcemanager.ha.rm.ids | 192.168.xx.xx,192.168.xx.xx | YARN ResourceManager addresses. If ResourceManager HA is enabled, enter the HA IP addresses (comma-separated); if the ResourceManager is a single node, leave the value empty |
+| yarn.application.status.address | http://ds1:8088/ws/v1/cluster/apps/%s | If ResourceManager HA is enabled or ResourceManager is not used, keep the default value. If the ResourceManager is a single node, replace ds1 with the ResourceManager's hostname |
+| development.state | false | Whether development mode is enabled |
+| dolphin.scheduler.network.interface.preferred | NONE | Name of the network interface that will be used |
+| dolphin.scheduler.network.interface.restrict | NONE | Name of the network interface that must not be used |
+| dolphin.scheduler.network.priority.strategy | default | IP acquisition strategy; default prefers the inner (intranet) address |
+| resource.manager.httpaddress.port | 8088 | ResourceManager port |
+| yarn.job.history.status.address | http://ds1:19888/ws/v1/history/mapreduce/jobs/%s | YARN job history status URL |
+| datasource.encryption.enable | false | Whether to enable datasource encryption |
+| datasource.encryption.salt | !@#$%^&* | Salt used for datasource encryption |
+| data-quality.jar.name | dolphinscheduler-data-quality-dev-SNAPSHOT.jar | Jar package used for data quality |
+| support.hive.oneSession | false | Whether Hive SQL runs in the same session |
+| sudo.enable | true | Whether to enable sudo |
+| alert.rpc.port | 50052 | RPC port of the Alert Server |
+| zeppelin.rest.url | http://localhost:8080 | Zeppelin RESTful API address |
+| appId.collect | log | How the applicationId is collected. To use the aop method, change log to aop and uncomment the applicationId auto-collection environment variables in `bin/env/dolphinscheduler_env.sh`. Note: aop does not support submitting YARN jobs from a remote host (for example through the Beeline client), and it stops working if the user environment overrides the applicationId-collection variables from dolphinscheduler_env.sh |

 ## Api-server related configuration

@@ -661,18 +661,6 @@ private Constants() {
      */
     public static final String SYSTEM_LINE_SEPARATOR = System.getProperty("line.separator");

-    /**
-     * network interface preferred
-     */
-    public static final String DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED =
-            "dolphin.scheduler.network.interface.preferred";
-
-    /**
-     * network IP gets priority, default inner outer
-     */
-    public static final String DOLPHIN_SCHEDULER_NETWORK_PRIORITY_STRATEGY =
-            "dolphin.scheduler.network.priority.strategy";
-
     /**
      * exec shell scripts
      */
@@ -17,8 +17,6 @@

 package org.apache.dolphinscheduler.common.utils;

-import org.apache.dolphinscheduler.common.constants.Constants;
-
 import org.apache.commons.collections4.CollectionUtils;
 import org.apache.commons.lang3.StringUtils;
 import org.apache.http.conn.util.InetAddressUtils;
@@ -31,20 +29,32 @@
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

import lombok.extern.slf4j.Slf4j;

import com.google.common.collect.Sets;

/**
* NetUtils
*/
@Slf4j
public class NetUtils {

private static final String DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED =
"dolphin.scheduler.network.interface.preferred";
private static final String DOLPHIN_SCHEDULER_NETWORK_INTERFACE_RESTRICT =
"dolphin.scheduler.network.interface.restrict";

private static final String DOLPHIN_SCHEDULER_NETWORK_PRIORITY_STRATEGY =
"dolphin.scheduler.network.priority.strategy";

private static final String NETWORK_PRIORITY_DEFAULT = "default";
private static final String NETWORK_PRIORITY_INNER = "inner";
private static final String NETWORK_PRIORITY_OUTER = "outer";
@@ -214,6 +224,13 @@ private static List<NetworkInterface> findSuitableNetworkInterface() {
             }
         }

+        Set<String> restrictNetworkInterfaceName = restrictNetworkInterfaceName();
+        if (CollectionUtils.isNotEmpty(restrictNetworkInterfaceName)) {
+            validNetworkInterfaces = validNetworkInterfaces.stream()
+                    .filter(validNetworkInterface -> !restrictNetworkInterfaceName
+                            .contains(validNetworkInterface.getDisplayName()))
+                    .collect(Collectors.toList());
+        }
         return filterByNetworkPriority(validNetworkInterfaces);
     }

@@ -297,16 +314,24 @@ private static List<NetworkInterface> getAllNetworkInterfaces() throws SocketExc
     }

     private static String specifyNetworkInterfaceName() {
-        return PropertyUtils.getString(
-                Constants.DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED,
-                System.getProperty(Constants.DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED));
+        return PropertyUtils.getString(DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED,
+                System.getProperty(DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED));
     }

+    private static Set<String> restrictNetworkInterfaceName() {
+        return PropertyUtils.getSet(DOLPHIN_SCHEDULER_NETWORK_INTERFACE_RESTRICT, value -> {
+            if (StringUtils.isEmpty(value)) {
+                return Collections.emptySet();
+            }
+            return Arrays.stream(value.split(",")).map(String::trim).collect(Collectors.toSet());
+        }, Sets.newHashSet("docker0"));
+    }
+
     private static List<NetworkInterface> filterByNetworkPriority(List<NetworkInterface> validNetworkInterfaces) {
         if (CollectionUtils.isEmpty(validNetworkInterfaces)) {
             return Collections.emptyList();
         }
-        String networkPriority = PropertyUtils.getString(Constants.DOLPHIN_SCHEDULER_NETWORK_PRIORITY_STRATEGY,
+        String networkPriority = PropertyUtils.getString(DOLPHIN_SCHEDULER_NETWORK_PRIORITY_STRATEGY,
                 NETWORK_PRIORITY_DEFAULT);
         switch (networkPriority) {
             case NETWORK_PRIORITY_DEFAULT:
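
The restrict filter added to `NetUtils` can be illustrated in isolation. The sketch below is not part of the commit: it assumes plain strings in place of `NetworkInterface` objects, and the names `RestrictFilterSketch`, `parseRestrictList`, and `dropRestricted` are hypothetical. It only mirrors the split-on-comma, trim, and exclude-by-name steps visible in the hunks above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the commit: shows the parse-and-exclude steps on plain names.
final class RestrictFilterSketch {

    // Split a comma-separated value such as "docker0, docker1" into trimmed names.
    static Set<String> parseRestrictList(String value) {
        if (value == null || value.isEmpty()) {
            return Collections.emptySet();
        }
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .collect(Collectors.toSet());
    }

    // Keep only the interface names that are not in the restricted set, preserving order.
    static List<String> dropRestricted(List<String> interfaceNames, Set<String> restricted) {
        return interfaceNames.stream()
                .filter(name -> !restricted.contains(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> restricted = parseRestrictList("docker0, docker1");
        List<String> candidates = Arrays.asList("eth0", "docker0", "lo", "docker1");
        System.out.println(dropRestricted(candidates, restricted)); // prints [eth0, lo]
    }
}
```

Note that the commit compares against `NetworkInterface.getDisplayName()`, so a restricted name only takes effect when it matches the display name the platform reports for that interface.
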
@@ -23,13 +23,15 @@
import org.apache.dolphinscheduler.common.enums.ResUploadType;

import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.lang3.StringUtils;

import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.function.Function;

import lombok.extern.slf4j.Slf4j;

@@ -294,4 +296,12 @@ public static Map<String, String> getPropertiesByPrefix(String prefix) {
         });
         return propertiesMap;
     }
+
+    public static <T> Set<T> getSet(String key, Function<String, Set<T>> transformFunction, Set<T> defaultValue) {
+        String value = (String) properties.get(key);
+        if (StringUtils.isEmpty(value)) {
+            return defaultValue;
+        }
+        return transformFunction.apply(value);
+    }
 }
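
To make the fallback behaviour of `getSet` concrete, here is a minimal sketch that re-creates the method against a local `java.util.Properties` instance. The class name `GetSetSketch`, the `PROPS` field, and the `splitOnComma` helper are hypothetical; the real `PropertyUtils` keeps its own `properties` object, whose loading is not shown in this hunk.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for PropertyUtils, used only to show the fallback rules.
final class GetSetSketch {

    // Assumption: a local Properties object replaces the one PropertyUtils loads at startup.
    static final Properties PROPS = new Properties();

    // Same shape as the added method: a missing or empty value yields the default set.
    static <T> Set<T> getSet(String key, Function<String, Set<T>> transform, Set<T> defaultValue) {
        String value = PROPS.getProperty(key);
        if (value == null || value.isEmpty()) {
            return defaultValue;
        }
        return transform.apply(value);
    }

    // Comma-separated value to a set of trimmed names.
    static Set<String> splitOnComma(String value) {
        return Arrays.stream(value.split(",")).map(String::trim).collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        String key = "dolphin.scheduler.network.interface.restrict";
        Set<String> fallback = new HashSet<>(Collections.singletonList("docker0"));

        // Key absent: the default set is returned.
        System.out.println(getSet(key, GetSetSketch::splitOnComma, fallback)); // prints [docker0]

        // Key present: the transform function decides the result.
        PROPS.setProperty(key, "docker0, docker1");
        System.out.println(getSet(key, GetSetSketch::splitOnComma, fallback).size()); // prints 2

        // Key present but empty: the default set is returned again.
        PROPS.setProperty(key, "");
        System.out.println(getSet(key, GetSetSketch::splitOnComma, fallback)); // prints [docker0]
    }
}
```

One consequence, as the method is written in this hunk: an empty property value falls back to the default instead of producing an empty set, so the `StringUtils.isEmpty` branch inside the lambda that `NetUtils` passes is not reached through `getSet`.
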
3 changes: 3 additions & 0 deletions dolphinscheduler-common/src/main/resources/common.properties
@@ -125,6 +125,9 @@ sudo.enable=true
 # network interface preferred like eth0, default: empty
 #dolphin.scheduler.network.interface.preferred=

+# network interface restrict like docker0,docker1 , default: docker0
+dolphin.scheduler.network.interface.restrict=docker0
+
 # network IP gets priority, default: inner outer
 #dolphin.scheduler.network.priority.strategy=default

@@ -19,9 +19,18 @@

 import org.apache.dolphinscheduler.common.constants.Constants;

+import org.apache.commons.lang3.StringUtils;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Set;
+import java.util.stream.Collectors;
+
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.Test;

+import com.google.common.collect.Sets;
+
 public class PropertyUtilsTest {

     @Test
@@ -33,4 +42,15 @@ public void getString() {
     public void getResUploadStartupState() {
         Assertions.assertTrue(PropertyUtils.getResUploadStartupState());
     }
+
+    @Test
+    public void getSet() {
+        Set<String> networkInterface = PropertyUtils.getSet("networkInterface", value -> {
+            if (StringUtils.isEmpty(value)) {
+                return Collections.emptySet();
+            }
+            return Arrays.stream(value.split(",")).map(String::trim).collect(Collectors.toSet());
+        }, Sets.newHashSet("docker0"));
+        Assertions.assertEquals(Sets.newHashSet("docker0"), networkInterface);
+    }
 }
