Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table to IFeatureLayer #243

Open
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

mathieu17g
Copy link
Collaborator

@mathieu17g mathieu17g commented Sep 8, 2021

Here is a 1st draft to fix issue #152

I have implemented the conversion from table to an IFeatureLayer in a memory dataset.
It is done via an IFeatureLayer constructor but it can be moved to a createlayer method.

Here are some modifications that has / may need to be implemented:

Other additions:

  • added convert methods between OGRwkbGeometryType and IGeometry{OGRwkbGeometryType}. Necessary in column types conversion since it is based on try-catch on convert methods
  • used GDAL.ogr_gfld_setname to set first geomfield name, since setname! is not implemented for FeatureDefn. set name!(::FeatureDefn, ::String) may require to fix issue Model and handle relationships between GDAL objects systematically #215 first
  • fixed @ogrerr: message has to be escaped. This fix may also be necessary in other similar macros (not tested)
  • added convert methods between OGRFieldType and DataType for vectors, Int8 and Float16

Here is a test on multi_geom.csv test file:

julia> ds = AG.read("multi_geom.csv",
           options = [
               "GEOM_POSSIBLE_NAMES=point,linestring",
               "KEEP_GEOM_COLUMNS=NO",
           ],
       )
       layer = AG.getlayer(ds, 0)
       @show layer
       df = DataFrame(layer)
       @show df; println()
       @show describe(df); println()
       new_layer = AG.IFeatureLayer(df)
       @show new_layer
       new_df = DataFrame(new_layer)
       @show new_df; println()
       @show describe(new_df); println()
layer = Layer: multi_geom
  Geometry 0 (point): [wkbUnknown]
  Geometry 1 (linestring): [wkbUnknown]
     Field 0 (id): [OFTString], 5.1, 5.2
     Field 1 (zoom): [OFTString], 1.0, 2.0
     Field 2 (location): [OFTString], Mumbai, New Delhi

df = 2×5 DataFrame
 Row │ point               linestring               id      zoom    location
     │ IGeometr           IGeometry               String  String  String
─────┼────────────────────────────────────────────────────────────────────────
   1 │ Geometry: wkbPoint  Geometry: wkbLineString  5.1     1.0     Mumbai
   2 │ Geometry: wkbPoint  Geometry: wkbLineString  5.2     2.0     New Delhi

describe(df) = 5×7 DataFrame
 Row │ variable    mean     min     median   max        nmissing  eltype
     │ Symbol      Nothing  Union  Nothing  Union     Int64     DataType
─────┼─────────────────────────────────────────────────────────────────────────────────────
   1 │ point                                                   0  IGeometry{wkbPoint}
   2 │ linestring                                              0  IGeometry{wkbLineString}
   3 │ id                   5.1              5.2               0  String
   4 │ zoom                 1.0              2.0               0  String
   5 │ location             Mumbai           New Delhi         0  String

new_layer = Layer: 
  Geometry 0 (point): [wkbPoint]
  Geometry 1 (linestring): [wkbLineString]
     Field 0 (id): [OFTString], 5.1, 5.2
     Field 1 (zoom): [OFTString], 1.0, 2.0
     Field 2 (location): [OFTString], Mumbai, New Delhi

new_df = 2×5 DataFrame
 Row │ point               linestring               id      zoom    location
     │ IGeometr           IGeometry               String  String  String
─────┼────────────────────────────────────────────────────────────────────────
   1 │ Geometry: wkbPoint  Geometry: wkbLineString  5.1     1.0     Mumbai
   2 │ Geometry: wkbPoint  Geometry: wkbLineString  5.2     2.0     New Delhi

describe(new_df) = 5×7 DataFrame
 Row │ variable    mean     min     median   max        nmissing  eltype
     │ Symbol      Nothing  Union  Nothing  Union     Int64     DataType
─────┼─────────────────────────────────────────────────────────────────────────────────────
   1 │ point                                                   0  IGeometry{wkbPoint}
   2 │ linestring                                              0  IGeometry{wkbLineString}
   3 │ id                   5.1              5.2               0  String
   4 │ zoom                 1.0              2.0               0  String
   5 │ location             Mumbai           New Delhi         0  String

@mathieu17g mathieu17g marked this pull request as ready for review September 14, 2021 12:29
@yeesian yeesian self-requested a review September 15, 2021 02:00
@yeesian yeesian requested a review from visr September 30, 2021 04:19
@yeesian yeesian linked an issue Oct 8, 2021 that may be closed by this pull request
@yeesian
Copy link
Owner

yeesian commented Oct 9, 2021

(MAYBE) define a function dataset(::AbstractFeatureLayer) instead of calling ownedby property.

I do think it's a good idea to move away from the ownedby property -- I have ideas/plans of moving away from it in a systematic approach for addressing #171 and #215.

I'll write up about it in a RFC and get thoughts and feedback first.

@yeesian
Copy link
Owner

yeesian commented Oct 9, 2021

Unfortunately, it'll be difficult to review this PR with 38a23ef. If I still remember it right, I'd suggest

  1. keeping a copy of this branch (in case something goes wrong),
  2. deleting the merge commit 38a23ef (e.g. possibly in a git rebase interactive session)
  3. rebasing to master using git at the command line, resolve any conflicts, and
  4. force pushing to this branch

@mathieu17g
Copy link
Collaborator Author

mathieu17g commented Oct 9, 2021

OK as PR #245, I will close this PR and open a new one.
Currently, I have rebase to abort locally that blocks me. I'm digging into it

  1. keeping a copy of this branch (in case something goes wrong),
  2. deleting the merge commit 38a23ef (e.g. possibly in a git rebase interactive session)
  3. rebasing to master using git at the command line, resolve any conflicts, and
  4. force pushing to this branch

Thanks, I did as advised below

@mathieu17g mathieu17g force-pushed the IFeatureLayer_from_table branch 2 times, most recently from 286b10e to bbf3760 Compare October 9, 2021 09:23
Copy link
Owner

@yeesian yeesian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on -- it's ambitious in all the right ways and I really appreciate it!

My comments are mostly around src/tables.jl since it has become pretty complex to follow along, and I think it needs to get to a state where it "feels obvious" for maintainability.

src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
mathieu17g and others added 15 commits October 12, 2021 05:26
…tries, mixed float/int

Test with `nothing` skipped until PR yeesian#238 [Breaking] Return missing if the field is set but null. is merged
Added some test coverage on convert error cases:
-  ogrerr macro error message modification
- IFeature(table) constructor errors on type conversions
Added a test on OGRFieldType error handing
…file, illustrating the data modification that may be induced by the driver limitations (here "ESRI Shapefile")
Adjusted prerequisite
Added layer name option to IFeatureLayer
- no difference for geometry columns. Both `nothing` and `missing` values map to an UNSET geometry field (null pointer)
- field set to NULL for `missing` values and not set for `nothing` values
@mathieu17g mathieu17g force-pushed the IFeatureLayer_from_table branch from d48e4a8 to 1b6ab33 Compare October 12, 2021 03:56
- refactored conversion of table column types to ArchGDAL layer featuredefn's geomfield and field types
- added docstrings
- adjusted convert fonctions between `OGRFieldType`s / `OGRFieldSubType`s and `DataType`s to enable refactoring
Copy link
Owner

@yeesian yeesian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! This PR is shaping up really nicely, and I have a much better grasp of it now :)

The generated functions are really quite magical haha, so my remaining comments are mostly stylistic in nature.

test/test_tables.jl Outdated Show resolved Hide resolved
docs/src/tables.md Outdated Show resolved Hide resolved
docs/src/tables.md Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
Copy link
Owner

@yeesian yeesian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think the overall flow is there, but the amount of coherence checks and inference and so on makes me very hesitant about the API. I've suggested some changes to see if they might help keep things simple.

src/ogr/geometry.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
- Moved from `@generated` to normal functions
- Added conversion for `GeoInterface.AbstractGeometryCollection`
- Differentiated display between `IGeometry` and `Geometry`
- Added tests on compact display for `AbstracGeometry` (e.g in use in DataFrames)
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
src/tables.jl Outdated Show resolved Hide resolved
@mathieu17g
Copy link
Collaborator Author

@yeesian I dropped fieldtypes kwarg in src/ and test/ to check if it still works properly before stashing all the WKT/WKB parsing and geomcols kwarg

@mathieu17g
Copy link
Collaborator Author

mathieu17g commented Nov 28, 2021

@yeesian, in commit e8560e4, I dropped geomcols kwarg in src/ and test/

Note: Coverage is still impaired by codecov coverage not handling generated code properly. When I test coverage locally with Pkg.jl and Coverage.jl everything new is covered

@@ -251,26 +252,28 @@ function Base.show(io::IO, spref::AbstractSpatialRef)::Nothing
return nothing
end

function Base.show(io::IO, geom::AbstractGeometry)::Nothing
function Base.show(io::IO, geom::AbstractGeometry, prefix::String)::Nothing
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be named as _show or something, or does it need to have the Base. prefix?

@yeesian
Copy link
Owner

yeesian commented Dec 20, 2021

Wanted to say I like the new API a lot, thank you so much for working through all the review comments!

OFSTInt16::Vector{Int16},
OFSTInt16::Int16, # default type comes last
OFSTNone::Vector{Int32},
OFSTInt16::Float16,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and also Vector{Float16}) is not correct for OFSTInt16 AFAIK

@ErickChacon
Copy link

It would very nice to have this feature. It would make integration with other geometry packages easier. Is there any reason why this effort stopped? Is there still interest on this feature?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Construct ArchGDAL. IFeatureLayer via named tuples or DataFrame
5 participants