Dealing with strings that contain ANSI codes for colorizing can be a challenge, especially if you need the actual display length of the embedded strings.
Here is a quick-n-dirty function that can extract all the ansi_codes and return them, return just the embedded string(s), or return just the length of the line without the ansi_codes.
import re

# ####################################
def parse_codes(msg, rtrn_type='dict', *args, **kwargs):
    # ################################
    """
    purpose: grab all ansi_codes from a string and return them - see below
    requires:
        - msg: str  # line string that may or may not contain ansi_codes; codes are separated from the message string
    options:
        - rtrn: str  # default='dict' a dictionary will be returned = {'codes': ['code1', 'code2', 'code3', ...], 'strings': ['str1', 'str2', 'str3', 'str4', ...]}
                     # rtrn='list' returns the list of codes
                     # rtrn='string'|'str' returns string(s) joined as one string without codes
                     # rtrn='nclen'|'len' returns only the length of the (joined) string without any codes
    returns:
        - default = {'codes': ['code1', 'code2', ...], 'strings': ['str1', 'str2', ...]}
        - rtrn='list' = ['code1', 'code2', ...]
        - rtrn='string' = "".join(['str1', 'str2', ...])
        - rtrn='len' = len("".join(['str1', 'str2', ...]))
    notes:
        - this replaces the deprecated split_codes(), escape_ansi(), and nclen()
        - WIP
        - 20250826-1606
    """
    # """--== Local Import(s) ==--""" #
    # """--== Config ==--""" #
    # arg_val() is a helper from my own tools; it returns the value of whichever alias was passed in
    rtrn_type = arg_val(["rtrn", "type", "rtrntype", "rtrn_type"], args, kwargs, dflt=rtrn_type)
    # """--== Init ==--""" #
    pat = r"\x1b\[[0-9;]+m"  # ESC, '[', digits and semicolons, 'm'
    msg = str(msg).strip()
    # msg = clr_coded(msg)
    rtrn = None
    codes_l = ["", ""]
    # """--== Process ==--""" #
    codes_l = re.findall(pat, msg)
    if not codes_l:
        codes_l = ["", ""]
    # re.split() yields '' wherever msg starts or ends with a code (or two codes touch),
    # so drop the empty elems instead of slicing [1:-1], which loses text when msg starts with plain text
    elems_l = [elem for elem in re.split(pat, msg) if elem]
    # """--== Prepare rtrn ==--""" #
    if rtrn_type in ('dict', "d", "dictionary"):  # default
        rtrn = {'codes': codes_l, 'strings': elems_l}
    if rtrn_type in ('list', 'lst', 'l'):
        rtrn = codes_l
    if rtrn_type in ('string', 'str', 'joined', 'stripped', 'nocode', 'escaped'):
        rtrn = "".join(elems_l)
    if rtrn_type in ('nclen', 'len'):
        rtrn = len("".join(elems_l))
    # """--== return ==--""" #
    # print(f"{rtrn_type=} {rtrn=}")  # for debugging
    return rtrn
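The empty strings that show up in the re.split() result are expected behavior: whenever the pattern matches at the very start or very end of the string, re.split() reports the (empty) text on the outside of that match. The short sketch below shows this with the same pattern used in parse_codes(); the sample strings and variable names are made up for illustration.

```python
import re

# Same pattern as in parse_codes(): ESC, '[', digits/semicolons, 'm'
ANSI_PAT = r"\x1b\[[0-9;]+m"

wrapped = "\x1b[34mis blue\x1b[0m"      # starts and ends with a code
mixed = "This \x1b[31mis red\x1b[0m"    # starts with plain text

codes = re.findall(ANSI_PAT, wrapped)        # the codes, in order
wrapped_parts = re.split(ANSI_PAT, wrapped)  # '' before the first and after the last code
mixed_parts = re.split(ANSI_PAT, mixed)      # only a trailing ''

# Removing the codes outright gives the text as it is displayed
plain = re.sub(ANSI_PAT, "", mixed)

print(codes)
print(wrapped_parts)
print(mixed_parts)
print(plain, len(plain))
```

Because the leading empty string only appears when the line begins with a code, filtering out empty elements (or simply joining everything) is safer than always dropping the first and last element.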
This calls arg_val(), which returns the value for any rtrn_type='xxx' option (or one of its aliases) that was passed in. How that function works is alluded to in previous posts, but I might go back and revisit this matter in the future.
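To try parse_codes() without the rest of the toolset, a minimal stand-in for arg_val() could look like the sketch below. This is a hypothetical version written only from the way it is called above (a list of alias names, args, kwargs, and a dflt value); it is not the actual arg_val() and it assumes the option always arrives as a keyword argument.

```python
def arg_val(names, args, kwargs, dflt=None):
    """Hypothetical stand-in: return the value of the first alias found in kwargs.

    `args` is accepted only to match the call signature; this sketch ignores it.
    """
    for name in names:
        if name in kwargs:
            return kwargs[name]
    return dflt


# Example: the caller used the alias 'rtrn' instead of 'rtrn_type'
aliases = ["rtrn", "type", "rtrntype", "rtrn_type"]
print(arg_val(aliases, (), {"rtrn": "len"}, dflt="dict"))
print(arg_val(aliases, (), {}, dflt="dict"))
```

With this in place, a call such as parse_codes(msg, rtrn='len') would pick up 'len' from kwargs, while parse_codes(msg) would fall back to the default 'dict'.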
The line that calls clr_coded(msg) has been commented out here. In my use, that line calls another function that translates embedded color tags into ANSI codes. For example, a string like this “This [blue]is blue[/] and this [red]is red[/]” gets translated into ‘This \x1b[34mis blue\x1b[0m and this \x1b[31mis red\x1b[0m’. Perhaps in a future post I will show how clr_coded works, but that will be a lengthy post.
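As a rough illustration of that kind of translation, here is a tiny sketch that handles only the two colors and the reset tag from the example. The function name clr_coded_sketch and the TAG_CODES table are assumptions made up for this example; the real clr_coded is not shown here and certainly covers far more than this.

```python
import re

# Hypothetical tag table: just enough for the example string
TAG_CODES = {
    "blue": "\x1b[34m",  # SGR 34 = blue foreground
    "red": "\x1b[31m",   # SGR 31 = red foreground
    "/": "\x1b[0m",      # SGR 0 = reset
}


def clr_coded_sketch(msg):
    """Replace [tag] markers found in TAG_CODES with their ANSI codes."""
    def swap(match):
        # Unknown tags are left in the text untouched
        return TAG_CODES.get(match.group(1), match.group(0))

    return re.sub(r"\[(/|[a-z]+)\]", swap, msg)


tagged = "This [blue]is blue[/] and this [red]is red[/]"
coded = clr_coded_sketch(tagged)
print(repr(coded))
```

Feeding the result into a code-stripping step, such as the one in parse_codes(), gives back the plain text "This is blue and this is red".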
Thanks for reading…
Contact me if you have any questions.
Enjoy!