${launchId} is not being replaced (sometimes) #495

Open
cgr71ii opened this issue Aug 31, 2022 · 1 comment

cgr71ii commented Aug 31, 2022

Hi,

From the code, I can see that the value "${launchId}" is expected to be replaced with some other value, although I'm not sure which. While trying to understand the configuration file, I found that the disposition chain uses "${launchId}" as the value of the directory property. If I'm not mistaken, this should create a directory named after the substituted value. What happens instead, not always but sometimes, is that the job directory ends up containing a directory with the literal name "${launchId}". Is this expected? Other properties also use this value, but I haven't checked whether they are affected as well.

<!-- DISPOSITION CHAIN -->
<bean id="warcWriter" class="org.archive.modules.writer.WARCWriterChainProcessor">
  <!-- ... -->
  <!-- <property name="directory" value="${launchId}" /> -->
  <!-- ... -->
</bean>
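
To illustrate the symptom, here is a minimal Python sketch of generic "${name}" substitution. It is not Heritrix's actual implementation; the function name `resolve_placeholders`, the lenient handling of unknown names, and the launch ID value are all assumptions made for the example. It only shows how a substitution step that runs without the property set would leave the literal text behind, which would then be used as a directory name.

```python
import re

# Matches placeholders of the form ${name}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value, properties):
    """Replace each ${name} with properties[name]; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        # Assumed lenient behavior: an unknown name keeps its literal text.
        return properties.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, value)


# With the property set (made-up launch ID), the path resolves as intended.
resolved = resolve_placeholders("${launchId}/reports", {"launchId": "20220831113100"})

# Without it, the literal "${launchId}" survives and would become a directory name.
unresolved = resolve_placeholders("${launchId}/reports", {})
```

If something like this is what happens, the question becomes which code path reads the property before the launch ID has been assigned.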

I'm using the latest version of Heritrix (latest commit from master).

I think this has happened to me at the times when the issue described in this comment occurred:

ls -la /home/cgarcia/Documentos/heritrix3/build_1661889017/heritrix-3.4.0-SNAPSHOT/jobs/clashroyale/\$\{launchId\}/

# total 12
# drwxrwxr-x 3 cgarcia cgarcia 4096 ago 31 11:31 .
# drwxrwxr-x 8 cgarcia cgarcia 4096 ago 31 19:16 ..
# drwxrwxr-x 2 cgarcia cgarcia 4096 ago 31 11:31 reports

ls -la /home/cgarcia/Documentos/heritrix3/build_1661889017/heritrix-3.4.0-SNAPSHOT/jobs/clashroyale/\$\{launchId\}/reports/

# drwxrwxr-x 2 cgarcia cgarcia 4096 ago 31 11:31 .
# drwxrwxr-x 3 cgarcia cgarcia 4096 ago 31 11:31 ..
# -rw-rw-r-- 1 cgarcia cgarcia  280 ago 31 11:31 crawl-report.txt
# -rw-rw-r-- 1 cgarcia cgarcia    0 ago 31 11:31 seeds-report.txt
# -rw-rw-r-- 1 cgarcia cgarcia   13 ago 31 11:31 threads-report.txt
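
To check whether other jobs are affected, a small script along these lines could scan a jobs directory for directories whose name still contains an unresolved "${...}" placeholder. This is only a sketch: the function name `find_unresolved_dirs` is hypothetical, and it assumes that any directory name containing "${...}" is the result of a failed substitution.

```python
import re
from pathlib import Path

# Matches any ${...} placeholder left in a file or directory name.
PLACEHOLDER = re.compile(r"\$\{[^}]+\}")


def find_unresolved_dirs(root):
    """Return directories under root whose name still contains a ${...} placeholder."""
    root = Path(root)
    return sorted(
        path for path in root.rglob("*")
        if path.is_dir() and PLACEHOLDER.search(path.name)
    )
```

For example, calling `find_unresolved_dirs("jobs")` from the Heritrix installation directory would list every such directory across all jobs.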

cgr71ii commented Aug 31, 2022

It seems that the part of the configuration affected by this issue is:

<bean id="statisticsTracker" class="org.archive.crawler.reporting.StatisticsTracker" autowire="byName">
  <!-- ... -->
  <!-- <property name="reportsDir" value="${launchId}/reports" /> -->
  <!-- ... -->
</bean>

ato added the bug label on Sep 1, 2022