Show Unicode chars in px #4855

Open
Rot127 opened this issue Jan 21, 2025 · 3 comments
Labels
enhancement New feature or request refactor Refactoring requests UX/UI User Interface/User experience

Comments

@Rot127
Member

Rot127 commented Jan 21, 2025

Is your feature request related to a problem? Please describe.

When using px (or the interactive version in V), Unicode characters are not printed.
This is inconvenient, since most people cannot recognize Unicode byte sequences at a glance.

ps does print Unicode.

Example:

rizin test/bins/cmd/search/string_encodings/Arabic-Lipsum.utf_8

:> px 10
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x00000000  d8a3 d8b3 d98a d8a7 20d9                 ........ .
:> ps 10
أسيا مساعدة جعل عن, أخذ قد يونيو الثانية, نهاية الإقتصادية أي فقد. كما فسقط يتعلّق محاولات أي, هو الأحمر العمليات تلك, اكتوبر مقاطعة من كلا. هو لان وسفن أسيا الأوضاع, لم بوابة المبرمة عرض. إبّان اسبوعين البشريةً تعد في. كنقطة إيطاليا قام بل, أضف أن وبغطاء الباهضة.\xff\xff\xff
...

Describe the solution you'd like

Print Unicode chars instead of dots.
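
To make the request concrete, here is a minimal Python sketch of a text column that decodes valid UTF-8 sequences and falls back to a dot for everything else. It is not rizin code: the function name `text_column` is hypothetical, and it assumes the data is UTF-8, which is only one of many possible encodings. It also ignores alignment, since one printed character can stand for several bytes.

```python
def text_column(row: bytes) -> str:
    """Render the text column for one hexdump row (hypothetical helper).

    Printable ASCII is shown as-is, valid and printable UTF-8 sequences
    are shown as one character, and every other byte becomes '.'.
    """
    out = []
    i = 0
    while i < len(row):
        b = row[i]
        if 0x20 <= b < 0x7F:          # printable ASCII
            out.append(chr(b))
            i += 1
            continue
        # Sequence length implied by the UTF-8 lead byte (0 = not a lead byte).
        if 0xC2 <= b <= 0xDF:
            n = 2
        elif 0xE0 <= b <= 0xEF:
            n = 3
        elif 0xF0 <= b <= 0xF4:
            n = 4
        else:
            n = 0
        if n and i + n <= len(row):
            try:
                ch = row[i:i + n].decode("utf-8")   # strict decoding
            except UnicodeDecodeError:
                ch = None
            if ch is not None and ch.isprintable():
                out.append(ch)
                i += n
                continue
        out.append(".")               # invalid, truncated, or unprintable
        i += 1
    return "".join(out)
```

For the ten bytes from the example above (`d8a3 d8b3 d98a d8a7 20d9`), this returns the four Arabic letters, a space, and a single dot for the trailing `d9`, whose continuation byte falls outside the row.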

Describe alternatives you've considered

Replace ........ with something like <UNICODE> or similar?

Additional context

Unicode chars are not necessarily monospaced, which can lead to alignment problems depending on the font people use.

It is probably a good idea to solve this problem when we refactor the whole hex view at some point, which in turn also requires TUI fixes.
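
The alignment concern can be illustrated with a rough width estimate. The Python sketch below uses only the standard `unicodedata` module; the helper names `display_width` and `pad_to` are hypothetical, and the result is an approximation, because the actual rendering depends on the terminal and font.

```python
import unicodedata


def display_width(s: str) -> int:
    """Estimate how many terminal cells a string occupies (approximation)."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue                  # combining marks add no cell of their own
        # East Asian Wide/Fullwidth characters usually take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad_to(s: str, cells: int) -> str:
    """Right-pad with spaces so the estimated width reaches `cells`."""
    return s + " " * max(0, cells - display_width(s))
```

Padding by estimated cell width rather than by `len()` keeps a following column aligned in the common cases, but it cannot correct for fonts that render a character wider or narrower than the estimate.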

@Rot127 Rot127 added enhancement New feature or request refactor Refactoring requests UX/UI User Interface/User experience labels Jan 21, 2025
@notxvilka
Contributor

I am not sure it's a good idea. I think it would fit better in either 1) a completely new mode or 2) the pxa output. The problem is that there are many Unicode encodings and plenty of non-Unicode ones, plus the character alignment, and so on.

@well-mannered-goat
Contributor

I faced the same problem when working on a binary. What if we add a column that prints the auto-detected string in the hex dump? Although alignment and legibility would then become a problem, I guess.

@Rot127
Member Author

Rot127 commented Feb 1, 2025

Yes, alignment. And the auto-detection can be misleading: UTF-16 produces valid strings for most byte sequences (though this problem may be solved soon).
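
The UTF-16 point can be checked with a small experiment. The Python sketch below uses Python's own strict codecs, not rizin's string detection, and the helper names `decodes` and `acceptance_rate` are hypothetical. It measures how often random 8-byte buffers decode without error under a given encoding.

```python
import random


def decodes(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes without error under `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def acceptance_rate(encoding: str, samples: int = 2000,
                    length: int = 8, seed: int = 0) -> float:
    """Fraction of random byte buffers that the codec accepts."""
    rng = random.Random(seed)         # fixed seed for repeatable runs
    hits = 0
    for _ in range(samples):
        data = bytes(rng.getrandbits(8) for _ in range(length))
        hits += decodes(data, encoding)
    return hits / samples
```

Under these assumptions, `acceptance_rate("utf-16-le")` comes out far higher than `acceptance_rate("utf-8")`, because UTF-16 only rejects unpaired surrogates. Even the first eight bytes of the Arabic UTF-8 sample (`d8a3 d8b3 d98a d8a7`) decode cleanly as UTF-16-LE, yielding four unrelated characters.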
