muckrights-sans-merde

 bonum fabula frat

### cleaning-an-html-mess-0-3

previous version:

=> cleaning-an-html-mess-0-2.html cleaning-an-html-mess-0-2

*originally posted:* jun 2021

UPDATE: note an important bug with every version of this so far: it will remove some non-duplicates if they contain html. i thought this would be a non-issue, except it comes up if there is an html link posted by the chatter in that line.

what is most likely to trigger this bug? two consecutive urls (with nothing after the url), even if those urls are different.

what can fix this? parsing on ">" is insufficient. the parser needs to get slightly more involved to allow consecutive urls / html. (most html other than links should be converted to entities, which is why the parser wasnt concerned with this already).

either way, duplicates are now handled on the server side, so improving this further would be just for fun.

not even loading the code editor. editing the webpage directly. will test, of course.

this is a minor tweak upon discovering that quite rarely, some of those trailing spaces are actually &nbsp; lol. this tweak is just for fun, but it works.

```
# cancel consecutive repeated text 0.3
# jun 2021
# license: creative commons cc0 1.0 (public domain)
# http://creativecommons.org/publicdomain/zero/1.0/

# "compile" program from code editor:
# fig50 ccrt.fig | tail

#date
#Thu Jun 3 06:30:06 UTC 2021 <- started coding

#logic:
#1. parsing on when it hits: ">"
#2. parsing off when it hits: "</td"
# date
# Thu Jun 3 17:51:30 UTC 2021 0.2 started
# - len / minus 4 / left / rtrim
# - if leading "[", locate "]", len / minus location / right / ltrim
# date
# Thu Jun 3 18:13:52 UTC 2021 0.2 completed
#3. compare to buffer
#4. if non-match, print
#5. copy to buffer

#go!
p arrstdin
prevbuf ""
buf ""
fb 0
forin lin p
    forin each lin
        fb 1
        now buf plus each swap now buf
        # parsing on when it hits: ">"
        ifequal each ">"
            buf "" # reset
            next
        # parsing off when it hits: "</td"
        bufright4 buf right 4 lcase
        ifequal bufright4 "</td"
            # len / minus 4 / left / rtrim
            buflen buf len minus 4
            now buf left buflen split now "&nbsp;" join now " " rtrim split now " " join now "&nbsp;&nbsp;" swap now buf
            # if leading "[", locate "]", len / minus location / right / ltrim
            bufleft buf ltrim left 1
            ifequal bufleft "["
                bufloc instr buf "]"
                iftrue bufloc
                    buflen buf len minus bufloc
                    now buf right buflen ltrim swap now buf
                    next
                next
            # compare to buffer
            ifequal buf prevbuf
                fb 0
            else
                # if non-match, print
                fb 1
                next
            # copy to buffer
            prevbuf buf
            break
            next
        next
    buf ""
    iftrue fb
        now lin print
        next
    next

# date
# Thu Jun 3 06:59:43 UTC 2021
# 30 minutes including debugging
# (probably 20-25 without the comments)
```

=> https://muckrights-sans-merde.neocities.org
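
for anyone who doesnt read fig, here is a rough python sketch of the five logic steps from the comments above. it is not a line-for-line translation of the program: the function names `cell_text` and `cancel_repeats` are made up for this example, the sample rows are invented, and it skips the step that puts "&nbsp;&nbsp;" back in. it assumes each input line holds one table cell closed by "</td".

```python
def cell_text(line):
    """Return the normalized text in front of the first "</td", or None."""
    buf = ""
    for ch in line:
        buf += ch
        if ch == ">":
            buf = ""  # step 1: parsing (re)starts after every ">"
            continue
        if buf[-4:].lower() == "</td":  # step 2: parsing stops at "</td"
            # drop the "</td", treat &nbsp; as a space, trim the right side
            text = buf[:-4].replace("&nbsp;", " ").rstrip()
            # a leading "[...]" prefix (assumed to be a timestamp) is ignored
            if text.lstrip().startswith("["):
                close = text.find("]")
                if close != -1:
                    text = text[close + 1:].lstrip()
            return text
    return None  # no cell found on this line


def cancel_repeats(lines):
    """Keep a line unless its cell text equals the previous cell text."""
    prev = ""
    kept = []
    for line in lines:
        text = cell_text(line)
        # steps 3 and 4: compare to the buffer, keep the line on a non-match
        if text is None or text != prev:
            kept.append(line)
        # step 5: copy to the buffer
        if text is not None:
            prev = text
    return kept


if __name__ == "__main__":
    rows = [
        "<tr><td>[10:01] hello&nbsp;</td></tr>",
        "<tr><td>[10:02] hello </td></tr>",
        "<tr><td>[10:03] bye</td></tr>",
    ]
    for row in cancel_repeats(rows):
        print(row)
```

the sketch also shows the bug from the update: because the buffer resets on every ">", a cell that ends with a link leaves nothing in front of "</td", so two consecutive link-only lines compare as equal even when the urls differ.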