UnROOT.jl is a reader for the CERN ROOT file format written entirely in Julia, without any dependence on ROOT or Python.
- Download the latest Julia release
- Open up the Julia REPL (hit ] once to enter Pkg mode, hit backspace to exit it)
julia>]
(v1.8) pkg> add UnROOT
Quick Start (see docs for more)
julia> using UnROOT
julia> f = ROOTFile("test/samples/NanoAODv5_sample.root")
ROOTFile with 2 entries and 21 streamers.
test/samples/NanoAODv5_sample.root
└─ Events
├─ "run"
├─ "luminosityBlock"
├─ "event"
├─ "HTXS_Higgs_pt"
├─ "HTXS_Higgs_y"
└─ "⋮"
julia> mytree = LazyTree(f, "Events", ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
Row │ Electron_dxy nMuon Muon_eta Muon_pt
│ Vector{Float32} UInt32 Vector{Float32} Vector{Float32}
─────┼───────────────────────────────────────────────────────────
1 │ [0.000371] 0 [] []
2 │ [-0.00982] 2 [0.53, 0.229] [19.9, 15.3]
3 │ [] 0 [] []
4 │ [-0.00157] 0 [] []
⋮ │ ⋮ ⋮ ⋮ ⋮
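The branch list passed to LazyTree above mixes exact names with a regular expression. As a rough illustration of that kind of selection (this is not UnROOT's actual implementation; the helper name `select_branches` and the list of available branch names are assumptions made up for this sketch), matching could look like this:

```julia
# Hypothetical helper, NOT part of UnROOT: resolve a mix of exact names and
# regexes against a list of available branch names, without duplicates.
function select_branches(available::AbstractVector{<:AbstractString}, selectors)
    selected = String[]
    for sel in selectors
        matches = if sel isa Regex
            filter(name -> occursin(sel, name), available)
        else
            filter(==(sel), available)
        end
        for name in matches
            # keep the first occurrence only
            name in selected || push!(selected, name)
        end
    end
    return selected
end

# Made-up subset of branch names, for illustration only
available = ["run", "nMuon", "Muon_pt", "Muon_eta", "Muon_phi", "Electron_dxy"]
picked = select_branches(available, ["Electron_dxy", "nMuon", r"Muon_(pt|eta)$"])
```

The trailing `$` in the regex keeps branches such as `Muon_phi` out of the selection.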
You can iterate through a LazyTree:
julia> for event in mytree
@show event.Electron_dxy
break
end
event.Electron_dxy = Float32[0.00037050247]
julia> Threads.@threads for event in mytree # multi-threading
...
end
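When the body of a threaded loop updates shared state, the update has to be thread-safe. A minimal sketch, assuming a counter is all that is needed: the function name `count_events_with_muons` is hypothetical, and a plain vector of NamedTuples stands in for the tree so the snippet runs without a ROOT file.

```julia
using Base.Threads: @threads, Atomic, atomic_add!

# Hypothetical helper: count events that contain at least one muon.
# `tree` can be anything indexable whose elements expose `nMuon`.
function count_events_with_muons(tree)
    n = Atomic{Int}(0)              # shared counter, updated atomically
    @threads for event in tree
        if event.nMuon > 0
            atomic_add!(n, 1)
        end
    end
    return n[]
end

# Stand-in data (assumption): NamedTuples instead of a real LazyTree
events = [(nMuon = UInt32(0),), (nMuon = UInt32(2),),
          (nMuon = UInt32(0),), (nMuon = UInt32(1),)]
n_with_muons = count_events_with_muons(events)
```

For heavier per-event work, accumulating into one buffer per chunk and merging at the end usually contends less than a single atomic counter.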
Only one basket per branch is cached, so you don't have to worry about running out of RAM.
At the same time, event inside the for-loop is not materialized until a field is accessed. This means you should avoid accessing the same field more than once per event (see the performance tips in the docs).
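One way to follow the advice on double access is to bind the field to a local variable once and reuse it. The sketch below assumes a branch named `Muon_pt`; the function name `sum_muon_pt` is hypothetical, and a vector of NamedTuples stands in for the tree so the snippet is self-contained.

```julia
# Hypothetical helper: sum all muon pT values over the events.
function sum_muon_pt(tree)
    total = 0.0
    for event in tree
        pts = event.Muon_pt        # single access; reuse `pts` below
        isempty(pts) && continue
        total += sum(pts)
    end
    return total
end

# Stand-in data (assumption): NamedTuples instead of a real LazyTree
events = [(Muon_pt = Float32[],), (Muon_pt = Float32[20.0, 15.5],)]
total_pt = sum_muon_pt(events)
```

Writing `isempty(event.Muon_pt)` followed by `sum(event.Muon_pt)` would read the field twice per event, which is the pattern to avoid.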
XRootD is also supported, depending on the protocol:
- the "url" has to start with http:// or https://
- (1.6+ only) or the "url" has to start with root:// and have another // to separate server and file path
julia> r = @time ROOTFile("https://scikit-hep.org/uproot3/examples/Zmumu.root")
0.034877 seconds (5.13 k allocations: 533.125 KiB)
ROOTFile with 1 entry and 18 streamers.
julia> r = ROOTFile("root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")
ROOTFile with 1 entry and 19 streamers.
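The URL rules above can be sketched as a small classifier. It is for illustration only and does not reflect how UnROOT dispatches internally; the names `split_root_url` and `remote_protocol` are hypothetical.

```julia
# Hypothetical helper: split "root://server//path" into (server, path).
function split_root_url(url::AbstractString)
    startswith(url, "root://") ||
        throw(ArgumentError("expected a URL starting with root://"))
    rest = url[8:end]                      # drop the 7-character ASCII prefix
    idx = findfirst("//", rest)            # second "//" separates server and path
    idx === nothing &&
        throw(ArgumentError("expected another // between server and file path"))
    server = rest[1:prevind(rest, first(idx))]
    path = rest[last(idx):end]             # keep one leading slash
    return server, path
end

# Hypothetical helper: decide which access method a location string implies.
function remote_protocol(url::AbstractString)
    if startswith(url, "http://") || startswith(url, "https://")
        return :http
    elseif startswith(url, "root://")
        split_root_url(url)                # throws if the second // is missing
        return :xrootd
    else
        return :local
    end
end

server, path = split_root_url(
    "root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")
```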
We provide an experimental interface for hooking up UnROOT with your custom types; it takes only two steps, as explained in the docs.
As a showcase for this functionality, the TLorentzVector support in UnROOT is implemented with this plug-in system.
- Use GitHub issues for bug reports and feature requests; PRs are welcome for bug fixes, feature tuning, quality-of-life improvements, docs, examples, etc.
- See CONTRIBUTING.md for more information and recommended workflows for contributing to this package.
- Parsing the file header
- Read the TKeys of the top level dictionary
- Reading the available trees
- Reading the available streamers
- Reading a simple dataset with primitive streamers
- Reading of raw basket bytes for debugging
- Automatically generate streamer logic
- Prettier show for Lazy*s
- Clean up Cursor use
- Reading TNtuple #27
- Reading histograms (TH1D, TH1F, TH2D, TH2F, etc.) #48
- Clean up the readtype, unpack, stream! and readobjany construct
- Refactor the code and add more docs
- Class name detection of sub-branches
- High-level histogram interface
Special thanks to Jim Pivarski (@jpivarski) from the Scikit-HEP project, who is the main author of uproot, a native Python library to read and write ROOT files, which was and is a great source of inspiration and information for reverse engineering the ROOT binary structures.
Some additional debug output:
julia> using UnROOT
julia> f = ROOTFile("test/samples/tree_with_histos.root")
Compressed stream at 1509
ROOTFile("test/samples/tree_with_histos.root") with 1 entry and 4 streamers.
julia> keys(f)
1-element Array{String,1}:
"t1"
julia> keys(f["t1"])
Compressed datastream of 1317 bytes at 1509 (TKey 't1' (TTree))
2-element Array{String,1}:
"mynum"
"myval"
julia> f["t1"]["mynum"]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
UnROOT.TBranch
cursor: UnROOT.Cursor
fName: String "mynum"
fTitle: String "mynum/I"
fFillColor: Int16 0
fFillStyle: Int16 1001
fCompress: Int32 101
fBasketSize: Int32 32000
fEntryOffsetLen: Int32 0
fWriteBasket: Int32 1
fEntryNumber: Int64 25
fIOFeatures: UnROOT.ROOT_3a3a_TIOFeatures
fOffset: Int32 0
fMaxBaskets: UInt32 0x0000000a
fSplitLevel: Int32 0
fEntries: Int64 25
fFirstEntry: Int64 0
fTotBytes: Int64 170
fZipBytes: Int64 116
fBranches: UnROOT.TObjArray
fLeaves: UnROOT.TObjArray
fBaskets: UnROOT.TObjArray
fBasketBytes: Array{Int32}((10,)) Int32[116, 0, 0, 0, 0, 0, 0, 0, 0, 0]
fBasketEntry: Array{Int64}((10,)) [0, 25, 0, 0, 0, 0, 0, 0, 0, 0]
fBasketSeek: Array{Int64}((10,)) [238, 0, 0, 0, 0, 0, 0, 0, 0, 0]
fFileName: String ""
julia> seek(f.fobj, 238)
IOStream(<file test/samples/tree_with_histos.root>)
julia> basketkey = UnROOT.unpack(f.fobj, UnROOT.TKey)
UnROOT.TKey64(116, 1004, 100, 0x6526eafb, 70, 0, 238, 100, "TBasket", "mynum", "t1")
julia> s = UnROOT.datastream(f.fobj, basketkey)
Compressed datastream of 100 bytes at 289 (TKey 'mynum' (TBasket))
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=100, maxsize=Inf, ptr=1, mark=-1)
julia> [UnROOT.readtype(s, Int32) for _ in 1:f["t1"]["mynum"].fEntries]
Compressed datastream of 1317 bytes at 6180 (TKey 't1' (TTree))
25-element Array{Int32,1}:
0
1
2
3
4
5
6
7
8
9
10
10
10
10
10
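The last step of the session above reads Int32 values from a decompressed basket. ROOT stores numbers in big-endian byte order, so each value has to be converted to the host's byte order after reading. The following sketch shows that idea using only Base Julia; the function name `read_big_endian` is hypothetical, and the code is not meant to mirror UnROOT's internal `readtype`.

```julia
# Hypothetical helper: read `n` big-endian values of type `T` from `io`.
function read_big_endian(io::IO, ::Type{T}, n::Integer) where {T}
    # `ntoh` converts from network (big-endian) order to host order
    return [ntoh(read(io, T)) for _ in 1:n]
end

# Three big-endian Int32 values (0, 1, 2) as raw bytes
bytes = UInt8[0x00, 0x00, 0x00, 0x00,
              0x00, 0x00, 0x00, 0x01,
              0x00, 0x00, 0x00, 0x02]
values = read_big_endian(IOBuffer(bytes), Int32, 3)
```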
Thanks go to these wonderful people (emoji key):
- Tamas Gal 💻 📖 🚇 🔣
- Jerry Ling 💻
- Johannes Schumann 💻
- Nick Amin 💻
- Mosè Giordano 🚇
- Oliver Schulz 🤔
- Misha Mikhasenko 🔣
This project follows the all-contributors specification. Contributions of any kind welcome!