
Commit

Created a VPE platform demo with only one fake pedestrian tracking algorithm module.

Added checkpoint support and optimized operations for outputting to Kafka from Spark Streaming.

Add the Spark master setting to the system configuration file. Add some comments and documentation. Move the command sent to the MessageHandlingApplication into the key field of a Kafka message.
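
A minimal sketch of the idea, assuming the new Kafka producer API and placeholder topic and command names (not necessarily the project's real ones):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CommandSendingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            // The command travels in the message key; the task data travels in the value.
            byte[] taskData = "rtsp://some-camera/stream".getBytes();
            producer.send(new ProducerRecord<>("message-handling", "TRACK_PEDESTRIANS", taskData));
        }
    }
}
```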

Add a fake pedestrian tracker to simulate a pedestrian tracking application, together with a fake metadata-saving application.

Extract property resolving into a separate class and apply it to all the applications.

Supplemented some comments.

Added the license and README.

Now able to send customized classes (i.e. Track) through Kafka.
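
One plausible way to do this — a sketch, not necessarily the project's actual mechanism — is plain Java serialization to a byte array that Kafka can carry (assuming Track implements Serializable):

```java
import java.io.*;

public final class SerializationHelper {
    // Serialize any Serializable object (e.g. a Track) into bytes for a Kafka value.
    public static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Restore the object on the consumer side.
    public static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}
```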

Add an attribute recognition application with a fake attribute recognizer to the system.

MessageHandlingApp can now control the execution flow.

Unified the designs of the sinks.

The attribute recognition application can now handle tasks from Kafka and HDFS/database in parallel, by joining two streams.
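
A hedged sketch of the join (class and key names are illustrative): both inputs are keyed by task ID, and `join` pairs records sharing a key within each batch:

```java
import org.apache.spark.streaming.api.java.JavaPairDStream;
import scala.Tuple2;

public class StreamJoinSketch {
    // Pair each task's command from Kafka with its data from HDFS/database.
    public static JavaPairDStream<String, Tuple2<byte[], byte[]>> joinByTaskId(
            JavaPairDStream<String, byte[]> tasksFromKafka,
            JavaPairDStream<String, byte[]> tasksFromStorage) {
        return tasksFromKafka.join(tasksFromStorage);
    }
}
```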

Amended several commits. Now the system can run locally and on YARN, but when running on YARN, it still cannot receive messages from Kafka.

Add comments to the Track class.

Add track ID to the Track class.

Some supplements to the attribute recognition.

Now able to receive messages from Kafka!

Now the property file does not need to be uploaded to HDFS. Edit it locally, and the system passes it to YARN automatically.

Solved a problem caused by the incompatibility of Spark Streaming checkpoints and Spark broadcasts.

Update README.md

Make parameters of SparkSubmit settable in the system.

However, they need further configuration in the YARN environment, and I have not figured out what configuration is needed.

Now all applications can run concurrently on a cluster.

Update README.md

Add extra configuration advice for running and monitoring multiple applications.

Add support for modifying scheduling strategies at startup.

Add log system.

Fix a bug caused by a wrongly generated serialization ID and checkpoint directory sharing. Unify logging methods. Solve an HDFS saving problem.

Now logs can be printed to the terminal that starts the application.

Use Kafka's createStream instead of createDirectStream for robustness and simplicity.
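
For reference, a minimal sketch of the receiver-based API (ZooKeeper address, group and topic are placeholders); unlike the direct API, offsets are tracked in ZooKeeper:

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class ReceiverSketch {
    public static JavaPairReceiverInputDStream<String, String> buildStream(JavaStreamingContext jssc) {
        Map<String, Integer> topicThreads = new HashMap<>();
        topicThreads.put("pedestrian-tracking-task", 1); // topic -> receiver thread count
        return KafkaUtils.createStream(jssc, "zk-host:2181", "vpe-group", topicThreads);
    }
}
```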

Reduce steps in the pedestrian tracking application.

Add support for storing images onto HDFS in JPEG format.
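
A sketch of the likely approach, assuming frames arrive as BufferedImages (the path is illustrative): `ImageIO` encodes the JPEG straight into an HDFS output stream.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsJpegWriter {
    public static void write(BufferedImage frame, String hdfsPath) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        try (FSDataOutputStream out = fs.create(new Path(hdfsPath))) {
            // Encode the frame as JPEG directly into the HDFS stream.
            ImageIO.write(frame, "jpeg", out);
        }
    }
}
```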

Use SparkLauncher instead to submit apps.
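
A sketch of submitting an app programmatically with `SparkLauncher` (JAR path and main class are placeholders):

```java
import org.apache.spark.launcher.SparkLauncher;

public class SubmitterSketch {
    public static void main(String[] args) throws Exception {
        Process spark = new SparkLauncher()
                .setAppResource("bin/vpe-platform.jar")                    // placeholder JAR
                .setMainClass("org.cripac.isee.vpe.ctrl.MainController")   // placeholder class
                .setMaster("yarn-cluster")
                .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
                .launch();
        spark.waitFor(); // block until the submitted application exits
    }
}
```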

New modules should now register the topics they listen to with the TopicManager statically.
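
Illustrative only — the real TopicManager API may differ. The point is that registration happens in a static block, before any instance exists:

```java
import java.util.HashSet;
import java.util.Set;

// A stand-in for the platform's TopicManager.
class TopicManagerSketch {
    private static final Set<String> REGISTERED_TOPICS = new HashSet<>();

    static void registerTopic(String topic) {
        REGISTERED_TOPICS.add(topic);
    }
}

// A module registers its topics as soon as the class is loaded.
class PedestrianAttrRecogAppSketch {
    static final String JOB_TOPIC = "attr-recog-job"; // assumed topic name

    static {
        TopicManagerSketch.registerTopic(JOB_TOPIC);
    }
}
```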

Enable parallel Kafka receiver. Move configurations into files.

Make the metadata saving directory changeable. Fix bugs.

Add comments and in-code docs.

Create native tracker interface.

Add native file.

Change VideoData's variable type.

Add test for HDFS video decoder.

Updated the name of the project.

From VPE-Platform to LaS-VPE Platform.

Add submodule of video decoder.

Create TaskData class.

Add comments.

Use TaskData for graph-like task scheduling.

Format all the files. Add comments.

Add ReID module.

Optimize the way of adding new modules.

Add data feeding module for retrieving data from storage.

Reorganize packages of some classes.

Enable running multiple applications in one command.

Unify the routine for building parallel Kafka receiving streams.
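
A sketch of what such a unified routine might look like (method and parameter names are assumptions): spawn several receivers for the same topic and union them into one stream for higher ingest parallelism.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class ParallelReceiverSketch {
    public static JavaPairDStream<String, String> build(
            JavaStreamingContext jssc, String zkQuorum, String group,
            String topic, int numReceivers) {
        Map<String, Integer> topicMap = new HashMap<>();
        topicMap.put(topic, 1);
        List<JavaPairDStream<String, String>> streams = new ArrayList<>();
        for (int i = 0; i < numReceivers; ++i) {
            streams.add(KafkaUtils.createStream(jssc, zkQuorum, group, topicMap));
        }
        // Union all partial streams into a single logical stream.
        JavaPairDStream<String, String> unioned = streams.get(0);
        for (int i = 1; i < streams.size(); ++i) {
            unioned = unioned.union(streams.get(i));
        }
        return unioned;
    }
}
```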

Fix bugs. Make it easy to switch between two Kafka receiving methods.

Enhance robustness of Kafka producer usage. Reduce memory cost.

Fix bugs.

Fix bugs. Remove large files in preparation for upgrading to Spark 2.0.

Update the ReID API. Add an external solver for ReID.

Optimize joining operation for ReID.

Combine metadata saving app and data feeding app to reduce containers.

Make BoundingBox class static.

Modify ReID interface.

Use JSON for saving tracks. Regularize the ReID and database connector interfaces.
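
A sketch of the idea using Gson (the library choice and field names here are assumptions, not the project's actual schema):

```java
import com.google.gson.Gson;

public class TrackletJsonSketch {
    static class Tracklet {
        String id;
        int startFrame;
        int[][] boundingBoxes; // per-frame {x, y, width, height}
    }

    public static void main(String[] args) {
        Gson gson = new Gson();
        Tracklet t = new Tracklet();
        t.id = "CAM01-0001";
        t.startFrame = 178;
        t.boundingBoxes = new int[][]{{134, 336, 60, 120}};
        String json = gson.toJson(t);                          // store this string
        Tracklet back = gson.fromJson(json, Tracklet.class);   // restore it later
        System.out.println(json + " -> " + back.id);
    }
}
```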

Reorganize native modules.

Upload Decoder.java and Decoder_Test.java

Add a frame-skipping function to the video decoder.

Correct usage of VideoDecoder in its test class.

Add Maven support.

Add a function for getting linked pedestrians to the graph database connector interface.

Add version to scripts.

Enable Maven to automatically handle native libraries.

Make Maven build to the bin directory. Correct scripts. Update README.

Solve a bug occurring when the lib directory does not exist during mvn package.

Disable library removal during Maven clean. Add support for Windows native libraries.

Suppress a copy error during Maven builds. Remove the dependency on hadoop-hdfs.

Enable packing dependencies into the JAR file.

Fix bugs.

Use a simple singleton instead of a broadcast.
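
A sketch of the common pattern behind this change: a lazily-initialized, per-JVM singleton replaces a broadcast variable, since broadcasts cannot be restored from a Spark Streaming checkpoint, while a singleton is simply re-created on each executor after recovery. The producer type here is only an example:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class KafkaProducerSingleton {
    private static volatile KafkaProducer<String, byte[]> instance;

    public static KafkaProducer<String, byte[]> get(Properties props) {
        // Double-checked locking: create the producer once per executor JVM.
        if (instance == null) {
            synchronized (KafkaProducerSingleton.class) {
                if (instance == null) {
                    instance = new KafkaProducer<>(props);
                }
            }
        }
        return instance;
    }
}
```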

Remove Spark context settings in apps.

Remove useless variables and parameters.

Make SingletonManager manage classes by class names rather than class types.

Update the in-file license. Enable updating instances on creation of the manager.

Change to Maven standard directory layout. Create JUnit test for
VideoDecoder.

Adapt to new Video Decoder supporting CMake.

Change exception thrown by GraphDatabaseConnector.

Add Javadoc.

Add pedestrian attribute recognizer using external solver.

Create base class for ReID feature.

Rename Track to Tracklet.

Ignore VSCode files.

Add to the Javadoc of the classes using external solvers.

Add native folder. Add annotation.

Enhance robustness of socket receiving. Solve bugs in the external solvers.

Adapt to new version of Video Decoder.

Enable processing a whole dataset within one command.

Add ISEE Basic Pedestrian Tracker.

Add support for uploading extra configuration files to Spark.

Enable broadcasting configuration files to workers.

Now native modules are pushed to the cluster to be built.

Solve a bug where messages were too large to be sent.

Add test of JNI of ISEEBasicTracker.

Solved the configuration uploading problem.

Simplify node and execution plan implementation.

Redesigned TaskData class.

Fix bugs.

Tracking can work now!
kyu-sz committed Oct 31, 2016
1 parent 2e0dade commit 95d8425
Showing 399 changed files with 81,734 additions and 242 deletions.
15 changes: 0 additions & 15 deletions .classpath

This file was deleted.

21 changes: 21 additions & 0 deletions .gitignore
@@ -0,0 +1,21 @@
# Binary files #
*.so

# Temporary files #
checkpoint/
*.log
*.swp
*.lck

# Eclipse project files #
.project
.classpath
.settings/
/bin/

# VSCode files #
.vscode

# IDEA files #
*.iml
.idea
6 changes: 6 additions & 0 deletions .gitmodules
@@ -0,0 +1,6 @@
[submodule "ISEE-Basic-Pedestrian-Tracker"]
path = src/native/ISEE-Basic-Pedestrian-Tracker
url = https://github.com/kyu-sz/ISEE-Basic-Pedestrian-Tracker.git
[submodule "Video-Decoder"]
path = src/native/Video-Decoder
url = https://github.com/kyu-sz/Video-Decoder.git
19 changes: 0 additions & 19 deletions .project

This file was deleted.

11 changes: 0 additions & 11 deletions .settings/org.eclipse.jdt.core.prefs

This file was deleted.

676 changes: 676 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

126 changes: 126 additions & 0 deletions README.md
@@ -0,0 +1,126 @@
# LaS-VPE Platform

[![AUR](https://img.shields.io/aur/license/yaourt.svg?maxAge=2592000)](LICENSE)

By Ken Yu, Yang Zhou, Da Li, Dangwei Li and Houjing Huang, under the guidance of Dr. Zhang Zhang and Prof. Kaiqi Huang.

LaS-VPE Platform is a large-scale distributed video parsing and evaluation platform under the Intelligent Scene Exploration and Evaluation (iSEE) research platform of the Center for Research on Intelligent Perception and Computing (CRIPAC), Institute of Automation, Chinese Academy of Sciences.

The platform is powered by Spark Streaming and Kafka.

The documentation is published on [Github Pages](https://kyu-sz.github.io/LaS-VPE-Platform).

## License

LaS-VPE Platform is released under the GPL License.

## Contents
1. [Requirements](#requirements)
2. [How to run](#how-to-run)
3. [How to monitor](#how-to-monitor)
4. [How to add a new module](#how-to-add-a-new-module)
5. [How to add a new native algorithm](#how-to-add-a-new-native-algorithm)
6. [How to deploy a new version](#how-to-deploy-a-new-version)

## Requirements

1. Use Maven to build the project:

```Shell
sudo apt-get install maven
```
2. Deploy Kafka (>=0.8), HDFS (>=2.2) and YARN (>=2.2) properly on your cluster.
To enable multiple applications to run concurrently, see [Job-Scheduling](https://spark.apache.org/docs/1.2.0/job-scheduling.html) and configure your environment.

## How to run

Clone the project to your cluster:

```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/kyu-sz/LaS-VPE-Platform
```

Build and pack the system into a JAR:

```Shell
mvn compile && mvn package
```

Configure the environment and running properties in the files in [conf](conf).

In particular, modify [cluster-env.sh](conf/cluster-env.sh) in [conf](conf) to match your cluster's addresses.

Upload the whole project to your cluster:

```Shell
./sbin/upload.sh
```

If the platform depends on native libraries, deliver them to the worker nodes using [install.sh](sbin/install.sh) in [sbin](sbin) on your cluster. Note that this script requires the _HADOOP_HOME_ environment variable.

Invoke the scripts from the project home directory with commands like ```./sbin/run-*.sh```.

It is recommended to start [run-command-generating-app.sh](sbin/run-command-generating-app.sh) last; it is a debugging tool that simulates commands sent to the message handling application.

Welcome to read Ken Yu's Chinese [blog](http://blog.csdn.net/kyu_115s/article/details/51887223) on experiences gained during the development.

## How to monitor

To briefly monitor, some information is printed to the console that starts each module. However, to use this function, you must register the hostname of the machine that starts the module on every task node.

To fully monitor your Spark application, you might need to access the log files on the slave nodes. However, if your application runs on a cluster without a desktop environment, and you connect to the cluster remotely, you might not be able to access the web pages served from the slave nodes.

To solve this problem, first add the IP addresses of the slave nodes to /etc/hosts on the master node. Make sure the master node can access the pages on the slave nodes with terminal browsers like w3m or lynx. On Ubuntu, they can be installed with ```sudo apt-get install w3m``` or ```sudo apt-get install lynx```.

Then, configure your master node to be a proxy server using Squid. Tutorials can be found on websites like [Help-Ubuntu-Squid](https://help.ubuntu.com/community/Squid).

Finally, configure your browser to use the proxy provided by your master node. It will then be able to access pages on the slave nodes.

In Firefox, it is recommended to use the AutoProxy plugin to enable the proxy. Besides the obvious configuration, you need to first access *about:config*, then set *network.proxy.socks_remote_dns* to *true*.

## Basic concepts in the project

_Application_: Same as that in YARN.

_Stream_: A flow of DStreams. Each stream may take in more than one input Kafka topic, but produces at most one kind of output. An _Application_ may contain multiple streams.

_Node_: An execution of a _Stream_. A pack of input data and parameters is fed into the stream.

_ExecutionPlan_: A flow graph of _Nodes_, as sketched below.
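
A rough illustrative sketch of how these concepts might relate in code (the real classes differ):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a plan is a graph of nodes; each node executes one
// stream with a pack of serialized input data and parameters.
class ExecutionPlanSketch {
    static class Node {
        String streamName;                          // which Stream this node executes
        byte[] inputData;                           // serialized data and parameters
        List<Node> successors = new ArrayList<>();  // downstream nodes fed by this one
    }

    List<Node> roots = new ArrayList<>();           // entry points of the flow graph
}
```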

## How to add a new module

A new module may be based on algorithms written in other languages, so you first need to wrap them into Java using JNI.

See an application such as [PedestrianTrackingApp](src/main/java/org/cripac/isee/pedestrian/tracking/PedestrianTracker.java) for an example of how to write an application module. Write your own module, then add it to this project. Also register its class name with the [AppManager](src/main/java/org/cripac/isee/vpe/ctrl/AppManager.java) by adding a line in the static block, similar to the other lines.
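
A hypothetical mirror of that registration pattern (the map and method names are assumptions, not the project's real API):

```java
import java.util.HashMap;
import java.util.Map;

public class AppManagerSketch {
    private static final Map<String, String> APP_CLASS_NAMES = new HashMap<>();

    static {
        APP_CLASS_NAMES.put("pedestrian-tracking", "org.cripac.isee.vpe.alg.PedestrianTrackingApp");
        APP_CLASS_NAMES.put("my-new-module", "org.example.MyNewApp"); // the line a new module adds
    }

    public static String classNameOf(String appName) {
        return APP_CLASS_NAMES.get(appName);
    }
}
```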

You may also need to extend the [CommandGeneratingApp](src/main/java/org/cripac/isee/vpe/debug/CommandGeneratingApp.java), [MessageHandlingApp](src/main/java/org/cripac/isee/vpe/ctrl/MessageHandlingApp.java) and [DataManagingApp](src/main/java/org/cripac/isee/vpe/data/DataManagingApp.java) to support the module.

## How to add a new native algorithm

You may want to run algorithms written in other languages like C/C++ on this platform. There are already examples of this: see [Video-Decoder](Video-Decoder) and [ISEE-Basic-Pedestrian-Tracker](ISEE-Basic-Pedestrian-Tracker).

First of all, you should wrap your algorithm with JNI. It is recommended to implement this in a separate GitHub repository and import it as a submodule.

Then, add the corresponding Java class to the platform. Be careful to put it in a suitable package.
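
A minimal, hypothetical JNI wrapper (library and method names are illustrative):

```java
public class MyNativeTracker {
    static {
        System.loadLibrary("my_native_tracker"); // loads libmy_native_tracker.so from lib/linux
    }

    // Implemented in C/C++ and bound via JNI.
    public native byte[] track(byte[] frameData, int width, int height);
}
```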

Finally, build your algorithm project, and copy the resulting shared JNI library, together with those it depends on, into the [library directory](lib/linux).

To enable automatic building and cleaning together with Maven, it is recommended to use CMake to build your project. Then edit the [native library building script](sbin/build-native-libs.sh) and the [native library cleaning script](sbin/clean-native-libs.sh), following the examples in them.

If the new algorithm requires extra configuration files, remember to register them with the [ConfigFileManager](src/main/java/org/cripac/isee/vpe/ctrl/ConfigFileManager.java).

## How to deploy a new version

Pack the new or modified project into a new JAR.

If you have updated the version number, remember to check the [system property file](conf/system.properties), where the option named "vpe.platform.jar" specifies the name of the JAR file to upload; it should match the name of your newly built JAR file.

Upload the JAR to your cluster with your customized [uploading script](sbin/upload.sh).

After that, kill the particular old application and run the new one. There is no need to restart the other modules! Your module now runs together with the original modules.

However, for modified modules, if you have already run them once with checkpointing enabled, you should clean the old checkpoint directory or use a new one, so that the system creates new contexts rather than recovering old ones.

Sometimes you may also need to clean the Kafka and ZooKeeper logs, which are stored by default in the /tmp folder on each node.
1 change: 0 additions & 1 deletion bin/.gitignore

This file was deleted.

7 changes: 7 additions & 0 deletions conf/cluster-env.sh
@@ -0,0 +1,7 @@
#!/usr/bin/env bash
export DRIVER_USER="labadmin"
export DRIVER_NODE="rman-nod1"
export VPE_FOLDER="/home/labadmin/las-vpe-platform"

export HADOOP_HOME=${HADOOP_HOME}
export SLAVE_HADOOP_HOME=${HADOOP_HOME}
58 changes: 58 additions & 0 deletions conf/log4j.properties
@@ -0,0 +1,58 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %-5p %c{1}: %m%n

log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/tmp/spark-streaming.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %-5p %c{1}: %m%n

# By default, everything goes to console and file
log4j.rootLogger=INFO, console, RollingAppender

# The noisier Spark logs go to file only
log4j.logger.spark.storage=INFO, RollingAppender
log4j.additivity.spark.storage=false
log4j.logger.spark.scheduler=INFO, RollingAppender
log4j.additivity.spark.scheduler=false
log4j.logger.spark.CacheTracker=INFO, RollingAppender
log4j.additivity.spark.CacheTracker=false
log4j.logger.spark.CacheTrackerActor=INFO, RollingAppender
log4j.additivity.spark.CacheTrackerActor=false
log4j.logger.spark.MapOutputTracker=INFO, RollingAppender
log4j.additivity.spark.MapOutputTracker=false
log4j.logger.spark.MapOutputTrackerActor=INFO, RollingAppender
log4j.additivity.spark.MapOutputTrackerActor=false

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=WARN
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=WARN
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR

# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
121 changes: 121 additions & 0 deletions conf/pedestrian-tracking/isee-basic/CAM01_0.conf
@@ -0,0 +1,121 @@
[CAM]
CAMarea=4F
CAMID=1001000
CAMname=CAM01
FPS=13
HomographyMatrix_11=0.00
HomographyMatrix_12=0.00
HomographyMatrix_13=0.00
HomographyMatrix_14=0.00
HomographyMatrix_21=0.00
HomographyMatrix_22=0.00
HomographyMatrix_23=0.00
HomographyMatrix_24=0.00
HomographyMatrix_31=1.23
HomographyMatrix_32=2.35
HomographyMatrix_33=8.76
HomographyMatrix_34=2.78
HomographyMatrix_41=3.74
HomographyMatrix_42=5.12
HomographyMatrix_43=6.89
HomographyMatrix_44=1.32
[F10]
flag=1
nTripwire=1
ROI1=1001001
ROI1_nPts=2
ROI1_point1.x=122
ROI1_point1.y=357
ROI1_point2.x=639
ROI1_point2.y=369
[F20]
flag=1
dBgThresh=0.70
dFactor=2.52
iRefDistance=10
iDownScale=2
iInitFrame=178
iLostFrame=10
dInitLearnRate=0.01
dInitMean=127.50
dInitStd=18.00
dInitWeight=0.05
dDisWithoutPredict=90.00
dDisWithPredict=80.00
dMinObjSize=10000.00
dMaxObjSize=368640.00
dMinStd=17.00
dUpdateLearnRate=0.001
iPredictWinLen=5
iStartFrame=10
dSuddenRatio=0.90
dObjSizeLowRate=4.00
dObjSizeUpRate=4.00
dMaxVelocity=1000.00
bBrectDis=1
bDesOut=1
bRegionDis=1
bRegionOut=1
bSceneOut=1
bTimeOut=1
bTrajDis=1
bTypeOut=1
dNearbyPosition=0.00
dMediumPosition=0.00
dFarawayPosition=0.00
dMediumTargetSize=0.00
dNearbyTargetSize=0.00
dFarawayTargetSize=0.00
nAlarmROI=1
nNoUse=0
ROI1=1001001
ROI1_alarmType=0
ROI1_alarmLevel=0
ROI1_iTripwireDirection=3
ROI1_iApproachingFrameCount=1
ROI1_iPassedFrameCount=1
ROI1_nPts=4
ROI1_dMinTargetSize=0.00
ROI1_strDescription=
ROI1_point1.x=134
ROI1_point1.y=336
ROI1_point2.x=25
ROI1_point2.y=667
ROI1_point3.x=740
ROI1_point3.y=656
ROI1_point4.x=622
ROI1_point4.y=332
[F30]
flag=1
nROI=1
ROI1=1001001
ROI1_nPts=4
ROI1_point1.x=134
ROI1_point1.y=336
ROI1_point2.x=25
ROI1_point2.y=667
ROI1_point3.x=740
ROI1_point3.y=656
ROI1_point4.x=622
ROI1_point4.y=332
[F40]
flag=1
BackImagePath=test_data/img_data/background1.jpg
DownSampleWidth=640
DownSampleHeight=320
IfShadow=0
WeightNum_y=4
WeightCoefficient=1.12
NormalizationCoefficient=9.00
nROI=1
ROI1=1001001
ROI1_nPts=4
ROI1_point1.x=134
ROI1_point1.y=336
ROI1_point2.x=25
ROI1_point2.y=667
ROI1_point3.x=740
ROI1_point3.y=656
ROI1_point4.x=622
ROI1_point4.y=332
[F50]
