[Doc][guide/installation/pseudo-cluster.md] Documentation improvement #15310

Closed
3 tasks done
cncws opened this issue Dec 12, 2023 · 2 comments
Comments

@cncws
Contributor

cncws commented Dec 12, 2023

Search before asking

  • I have searched the issues and found no similar feature request.

Description

TL;DR

In guide/installation/pseudo-cluster.md, replace this:

# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-postgresql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:postgresql://127.0.0.1:5432/dolphinscheduler"
export SPRING_DATASOURCE_USERNAME={user}
export SPRING_DATASOURCE_PASSWORD={password}

with

# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-postgresql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:postgresql://127.0.0.1:5432/dolphinscheduler?stringtype=unspecified"
export SPRING_DATASOURCE_USERNAME={user}
# An empty password is not recommended: some tasks (e.g. Data Quality) will fail
export SPRING_DATASOURCE_PASSWORD={password}

Why

I deployed DS in a CentOS virtual machine and ran into problems caused by the database-related configuration. Here are my settings:

export SPRING_DATASOURCE_URL="jdbc:postgresql://localhost:5432/dolphinscheduler"
export SPRING_DATASOURCE_USERNAME="postgres"
export SPRING_DATASOURCE_PASSWORD=""

The password is empty because I only use the database locally. However, this causes a problem: DS cannot run Data Quality tasks.

When a Data Quality task runs, the master server prints a log like Task <task_name> is submitted to priority queue error. I read the source code and found that DS connects to the database configured above to look up the data source used by the task, and that this step fails with an NPE. The relevant code is in dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/runner/task/BaseTaskProcessor.java:

public DataSource getDefaultDataSource() {
    DataSource dataSource = new DataSource();

    HikariDataSource hikariDataSource = (HikariDataSource) defaultDataSource;
    dataSource.setUserName(hikariDataSource.getUsername());
    JdbcInfo jdbcInfo = JdbcUrlParser.getJdbcInfo(hikariDataSource.getJdbcUrl());
    if (jdbcInfo != null) {
        Properties properties = new Properties();
        properties.setProperty(USER, hikariDataSource.getUsername());
        // The NPE is thrown on this line: Properties rejects a null value
        properties.setProperty(PASSWORD, hikariDataSource.getPassword());
        properties.setProperty(DATABASE, jdbcInfo.getDatabase());
        properties.setProperty(ADDRESS, jdbcInfo.getAddress());
        properties.setProperty(OTHER, jdbcInfo.getParams());
        properties.setProperty(JDBC_URL, jdbcInfo.getAddress() + SINGLE_SLASH + jdbcInfo.getDatabase());
        dataSource.setType(DbType.of(JdbcUrlParser.getDbType(jdbcInfo.getDriverName()).getCode()));
        dataSource.setConnectionParams(JSONUtils.toJsonString(properties));
    }

    return dataSource;
}

The solution is quite simple: use a data source that has a password. I would like to add a hint to the docs to help others avoid this problem. It would be even better to make the Data Quality task compatible with an empty password.
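To illustrate the failure mode, here is a minimal standalone sketch, not the project's code. It assumes, as the stack trace above suggests, that getPassword() returns null when no password is configured. java.util.Properties rejects null values, so setProperty throws a NullPointerException. The class name EmptyPasswordSketch and the helper nullSafeProperties are hypothetical; the fallback to an empty string is just one possible way to tolerate a missing password.

```java
import java.util.Properties;

public class EmptyPasswordSketch {

    // Reproduces the failing call in isolation: returns true if
    // Properties.setProperty throws an NPE for the given value.
    static boolean setPasswordThrowsNpe(String password) {
        Properties properties = new Properties();
        try {
            properties.setProperty("password", password);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Hypothetical null-safe variant (an assumption, not the actual fix in DS):
    // substitute an empty string when the password is null.
    static Properties nullSafeProperties(String user, String password) {
        Properties properties = new Properties();
        properties.setProperty("user", user);
        properties.setProperty("password", password == null ? "" : password);
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(setPasswordThrowsNpe(null));     // true
        System.out.println(setPasswordThrowsNpe("secret")); // false
        Properties p = nullSafeProperties("postgres", null);
        System.out.println(p.getProperty("password").isEmpty()); // true
    }
}
```

Whether an empty string is acceptable downstream depends on how the connection parameters are consumed, which this sketch does not cover.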

Another suggestion: many tables use timestamp columns, but DS inserts string values such as "2023-12-12 12:00:00". This causes errors like: column "data_time" is of type timestamp without time zone but expression is of type character varying. Appending ?stringtype=unspecified to SPRING_DATASOURCE_URL avoids this problem, because the PostgreSQL JDBC driver then sends string parameters as untyped values and lets the server infer the column type.
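If the URL is assembled programmatically rather than edited by hand, the parameter has to be joined with ? or & depending on whether a query string is already present. The following is a small sketch under that assumption; the class JdbcUrlSketch and the method withStringTypeUnspecified are hypothetical names, and the check for an existing stringtype parameter is a simple substring match rather than a full URL parser.

```java
public class JdbcUrlSketch {

    // Appends stringtype=unspecified to a PostgreSQL JDBC URL unless a
    // stringtype parameter is already present (simple substring check).
    static String withStringTypeUnspecified(String jdbcUrl) {
        if (jdbcUrl.contains("stringtype=")) {
            return jdbcUrl;
        }
        // Use '&' when the URL already carries parameters, '?' otherwise.
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "stringtype=unspecified";
    }

    public static void main(String[] args) {
        System.out.println(withStringTypeUnspecified(
                "jdbc:postgresql://127.0.0.1:5432/dolphinscheduler"));
        System.out.println(withStringTypeUnspecified(
                "jdbc:postgresql://127.0.0.1:5432/dolphinscheduler?ssl=true"));
    }
}
```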

Documentation Links

https://dolphinscheduler.apache.org/zh-cn/docs/3.2.0/guide/installation/pseudo-cluster

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct


github-actions bot commented Mar 1, 2024

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

@github-actions github-actions bot added the Stale label Mar 1, 2024

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.
