asterisk

kenmirrors/asterisk

Fork 0

mirror of https://github.com/asterisk/asterisk.git synced 2025-09-02 19:16:15 +00:00

Commit Graph

Author	SHA1	Message	Date
George Joseph	25f7753f46	res_pjsip: Replace invalid UTF-8 sequences in callerid name * Added a new function ast_utf8_replace_invalid_chars() to utf8.c that copies a string replacing any invalid UTF-8 sequences with the Unicode specified U+FFFD replacement character. For example: "abc\xffdef" becomes "abc\uFFFDdef". Any UTF-8 compliant implementation will show that character as a � character. * Updated res_pjsip:set_id_from_hdr() to use ast_utf8_replace_invalid_chars and print a warning if any invalid sequences were found during the copy. * Updated stasis_channels:ast_channel_publish_varset to use ast_utf8_replace_invalid_chars and print a warning if any invalid sequences were found during the copy. ASTERISK-27830 Change-Id: I4ffbdb19c80bf0efc675d40078a3ca4f85c567d8	2023-03-01 09:50:02 -06:00
Sean Bright	7d96b3e437	utf8.c: Add UTF-8 validation and utility functions There are various places in Asterisk - specifically in regards to database integration - where having some kind of UTF-8 validation would be beneficial. This patch adds: * Functions to validate that a given string contains only valid UTF-8 sequences. * A function to copy a string (similar to ast_copy_string) stopping when an invalid UTF-8 sequence is encountered. * A UTF-8 validator that allows for progressive validation. All of this is based on the excellent UTF-8 decoder by Björn Höhrmann. More information is available here: https://bjoern.hoehrmann.de/utf-8/decoder/dfa/ The API was written in such a way that should allow us to replace the implementation later should we determine that we need something more comprehensive. Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9	2020-07-28 09:45:29 -05:00

Author

SHA1

Message

Date

George Joseph

25f7753f46

res_pjsip: Replace invalid UTF-8 sequences in callerid name

* Added a new function ast_utf8_replace_invalid_chars() to
  utf8.c that copies a string replacing any invalid UTF-8
  sequences with the Unicode specified U+FFFD replacement
  character.  For example:  "abc\xffdef" becomes "abc\uFFFDdef".
  Any UTF-8 compliant implementation will show that character
  as a � character.

* Updated res_pjsip:set_id_from_hdr() to use
  ast_utf8_replace_invalid_chars and print a warning if any
  invalid sequences were found during the copy.

* Updated stasis_channels:ast_channel_publish_varset to use
  ast_utf8_replace_invalid_chars and print a warning if any
  invalid sequences were found during the copy.

ASTERISK-27830

Change-Id: I4ffbdb19c80bf0efc675d40078a3ca4f85c567d8

2023-03-01 09:50:02 -06:00

Sean Bright

7d96b3e437

utf8.c: Add UTF-8 validation and utility functions

There are various places in Asterisk - specifically in regards to
database integration - where having some kind of UTF-8 validation would
be beneficial. This patch adds:

* Functions to validate that a given string contains only valid UTF-8
  sequences.

* A function to copy a string (similar to ast_copy_string) stopping when
  an invalid UTF-8 sequence is encountered.

* A UTF-8 validator that allows for progressive validation.

All of this is based on the excellent UTF-8 decoder by Björn Höhrmann.
More information is available here:

    https://bjoern.hoehrmann.de/utf-8/decoder/dfa/

The API was written in such a way that should allow us to replace the
implementation later should we determine that we need something more
comprehensive.

Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9

2020-07-28 09:45:29 -05:00

2 Commits