Excessive size of generated Swift code #1204
Comments
Code size is a common concern with code-generated approaches such as this. Protobuf implementations for some other languages rely heavily on reflection, which makes them smaller but significantly slower. If you're only using the binary encoding, it should be easy to strip out the field names and other content that's only there to support JSON and TextFormat encoding. Right now, I think this would require a small change to the code generator, but I've long been interested in emitting that content as separate .swift sources that contain only those extensions. It would then be easy to delete those files. (Alternatively, we could consider splitting the JSON and TextFormat support into a separate generator.) You could also look critically at whether there are other parts of the generated code that you might omit: for example, the generated …
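For illustration, here is a rough sketch of the file split described above, with the name data that only JSON/TextFormat need placed in its own deletable file. The message, field names, and file name are assumptions rather than real generated output, and a real generated message would also conform to `SwiftProtobuf.Message` (omitted here for brevity):

```swift
import SwiftProtobuf

// Stand-in for a generated message (hypothetical; the real generated struct
// also implements binary coding via SwiftProtobuf.Message).
struct ExampleMessage {
  var name: String = ""
  var createdAt: Int64 = 0
}

// ExampleMessage+Names.swift (hypothetical file): the field-name table that
// exists only to support JSON/TextFormat. If this lived in its own file, a
// binary-only client could simply delete it and drop the string data.
extension ExampleMessage: SwiftProtobuf._ProtoNameProviding {
  static let _protobuf_nameMap: SwiftProtobuf._NameMap = [
    1: .same(proto: "name"),
    2: .standard(proto: "created_at"),
  ]
}
```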
FYI: #18 is open for tracking splitting out the textual support.
Wrote a little wrapper to patch the generated Swift source code to remove conformance to …. I briefly looked into removing ….

Update: turns out we're using JSON encoding/decoding a little bit in the codebase, so we can't merge this, sadly.
Another improvement I wanted to look at in this area to reduce the amount of generated code was to make serialization and other related functionality (hashing, equatability) table-driven. Unfortunately, the only way to get static arrays of constant data into a data segment is through a SIL transform that only runs on optimized builds, and even then, whether that transform applies is very unpredictable. If it isn't applied, we'd end up generating code that heap-allocates those arrays and populates them element by element, and that code would run the first time a particular message is serialized, parsed, equality-tested, or hashed, which would make client code performance unpredictable in ways that we should probably avoid.*

* To be fair, this is already happening with the name tables we generate for text/JSON serialization, but that's restricted to a much smaller set of serialization operations that are expected to be less efficient than the binary format.
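As a rough illustration of what "table-driven" means here, the generator would need to emit something like a constant per-message field table. None of the types below exist in SwiftProtobuf; this is only a sketch of the shape of the data and of where the allocation problem comes from.

```swift
// Hypothetical field-table entry: enough information to serialize, parse,
// hash, and compare a field without per-field generated code.
struct FieldEntry {
  let number: Int        // proto field number
  let wireFormat: UInt8  // simplified stand-in for the wire type
  let storageSlot: Int   // simplified stand-in for a key path / storage offset
}

enum ExampleMessageTables {
  // Ideally this array would land in a read-only data segment. In practice,
  // Swift only does that via a SIL transform in optimized builds, and only
  // sometimes; otherwise the array is heap-allocated and filled element by
  // element the first time the message is serialized, parsed, compared, or hashed.
  static let fields: [FieldEntry] = [
    FieldEntry(number: 1, wireFormat: 2, storageSlot: 0),
    FieldEntry(number: 2, wireFormat: 0, storageSlot: 1),
  ]
}
```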
Fixes apple#944. Helps with apple#1204.
Presence can be checked with `isEmpty` on the Array, and the values can be cleared by assigning `[]`. Fixes apple#944. Helps with apple#1204.
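A minimal usage sketch of that pattern, using a hand-written stand-in for a generated message (the type and field names are made up):

```swift
// Stand-in for a generated message with a repeated string field; in generated
// code a repeated field is just a Swift Array property.
struct Example_Item {
  var tags: [String] = []
}

var item = Example_Item()

if item.tags.isEmpty {              // "presence" check: the array has no elements
  item.tags = ["swift", "protobuf"]
}

item.tags = []                      // clearing the field: assign an empty array
```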
What if there were an option to opt in to only one serialization mechanism? Say a client only needs binary encoding/decoding; would that make any difference in the size of the generated code?
The idea of having an opt-in is a good one, and it's something we've discussed on many occasions. It would certainly make some difference, though someone would have to actually try it and measure to figure out how much it saves. But the detailed design is tricky: …
At this point, I would say that we have lots of good ideas; we really need some folks to actually try implementing some of these ideas and see how well they work out.
#1240 has a draft of some work I did to split the generated code into what is needed for just the binary format, plus extra files needed for the textual formats. Since the library uses a Visitor/Decoder pattern, there isn't a lot of code specific to the formats. At the moment, the field numbers and binary encoding information are part of the base generated code, as that's a very small amount of data. The textual support then layers on the needed mapping between field numbers and names. Since the JSON names can mostly be derived from the TextFormat names, in most cases we just need one string and a marker saying the other one can be derived. Splitting that into two completely separate things could result in even larger code when folks need both, since we'd potentially be more verbose instead of allowing things to be derived.

One thing #1240 doesn't yet take on is splitting up the core runtime library, so that if you don't need the textual formats, you don't have to link that backing code. No work has been done to see how much that might save. Using that PR as a starting point would likely make sense for getting more clarity into what the potential savings would be.
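For illustration, the derivation mentioned above follows the standard proto3 JSON mapping (underscores removed, the following letter uppercased), which is why one stored string plus a "derivable" marker covers most fields. This is a simplified sketch of the rule, not the library's actual implementation:

```swift
// Derive the JSON field name from the proto (TextFormat) name, e.g.
// "created_at" -> "createdAt". Edge cases (digits, leading underscores,
// explicit json_name overrides) are ignored in this sketch.
func derivedJSONName(fromProtoName protoName: String) -> String {
  var result = ""
  var uppercaseNext = false
  for character in protoName {
    if character == "_" {
      uppercaseNext = true
    } else if uppercaseNext {
      result.append(contentsOf: String(character).uppercased())
      uppercaseNext = false
    } else {
      result.append(character)
    }
  }
  return result
}

// derivedJSONName(fromProtoName: "created_at") == "createdAt"
```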
👋 Related to the size of the generated code, the size of the SwiftProtobuf SDK itself is also considerable: 1.4 MB for the latest version, 1.25.2 (this is the size of the binary built statically inside a production app, measured using a linkmap). Adding this comment here with the size information just for context.
Hello!
Many parts of our codebase use SwiftProtobuf. Recently we started tracking app size in a more accurate manner, and we noticed a trend that is pretty worrying for us: generated Swift proto code increases our app size a lot. Recently, we removed a single proto file of about 400 LoC that contained about 70 `message` definitions (including the various transitive imports); the generated code was about 5 KLoC, and 304 KB of our app size was attributed to symbols coming from the generated Swift Protobuf code.

We are building with `SWIFT_OPTIMIZATION_LEVEL = -Osize` in release mode, but I wonder if there are other ways to reduce the size of the generated Swift code.

I can't exactly share my full proto, but I was wondering if this is a known issue with Swift, or maybe there are ways to reduce the impact of the generated code. Does anyone have experience with this particular issue?