muckrights-sans-merde

bonum fabula frat

### cleaning-an-html-mess-in-30-minutes

other pages:
=> sort-of-against-software-tyranny.html sort-of-against-software-tyranny

*originally posted:* jun 2021

> is work being done to clean up the HTML logs from duplicates from the bridges? Jun 02 22:12

yeah, this is getting annoying. lets fix it.

i had a look at the html, and was satisfied that anything we wanted to compare was between ">" and "</td" which is actually a bit cocky in terms of reliable work, but i was really tired of reading the log the log with a with a fucking fucking echo echo.

i compared the filtered log with the unfiltered one, and everything looks good. it only removes lines with consecutive CONTENTS of the buffer, when they perfectly match.

the code took 30 minutes including debugging and comments (the comments are a bit much, they correspond to the logic i wrote first) and i was pretty much on automatic the whole time. i paid little attention to style.

heres what i wrote:

```
# cancel consecutive repeated text 0.1
# jun 2021
# license: creative commons cc0 1.0 (public domain)
# http://creativecommons.org/publicdomain/zero/1.0/

# "compile" program from code editor:
# fig50 ccrt.fig | tail

#date
#Thu Jun 3 06:30:06 UTC 2021 <- started coding

#logic:
#1. parsing on when it hits: ">"
#2. parsing off when it hits: "</td"
#3. compare to buffer
#4. if non-match, print
#5. copy to buffer

#go!

p arrstdin
prevbuf ""
buf ""
fb 0
forin lin p
    forin each lin
        fb 1
        now buf plus each swap now buf
        # parsing on when it hits: ">"
        ifequal each ">"
            buf "" # reset
            next
        # parsing off when it hits: "</td"
        bufright4 buf right 4 lcase
        ifequal bufright4 "</td"
            # compare to buffer
            ifequal buf prevbuf
                fb 0
            else
                # if non-match, print
                fb 1
                next
            # copy to buffer
            prevbuf buf
            break
            next
        next
    buf ""
    iftrue fb
        now lin print
        next
    next

# date
# Thu Jun 3 06:59:43 UTC 2021
# 30 minutes including debugging
# (probably 20-25 without the comments)
```

i noticed that at least one person wasnt having their duplicates filtered, because there was one trailing space after half their posts, before the closing table data tag.

i wanted to put the 30 minute version up, because it already does exactly what it was designed to do-- design-wise its not a bug, its a tweak-- but it should be fixable by cutting the trailing 4 characters from the buffer with len / minus 4 / left and trimming space with rtrim. if i fix that soon i will update here.

i have yet to determine if i should "ignore" (filter) the tr-bridge user. it looks like i could do that, but im surprised that more than one class of duplicate exists to be filtered.

=> https://muckrights-sans-merde.neocities.org
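
a rough sketch of the trailing-space tweak described above, written in python rather than fig, so it is not the program that actually filters the logs. the names `cell_key` and `filter_repeats` are made up for illustration, and the sketch assumes (as the post does) that the text worth comparing sits between a ">" and the first "</td" that follows it.

```python
def cell_key(line):
    """Return the text between a ">" and the first "</td" after it,
    with trailing whitespace removed, or None if the line has no such span."""
    buf = ""
    for ch in line:
        buf += ch
        if ch == ">":
            buf = ""  # parsing on: start a fresh buffer
        elif buf[-4:].lower() == "</td":
            # parsing off: cut the 4-character "</td", then trim trailing space
            return buf[:-4].rstrip()
    return None


def filter_repeats(lines):
    """Drop a line when its key matches the key of the previous keyed line."""
    prev = None
    kept = []
    for line in lines:
        key = cell_key(line)
        if key is not None and key == prev:
            continue  # consecutive repeat, skip it
        if key is not None:
            prev = key  # copy to buffer
        kept.append(line)
    return kept
```

to try it on a log, something like `sys.stdout.writelines(filter_repeats(sys.stdin))` should work. lines without a ">...</td" span always pass through and do not reset the comparison buffer.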

license: creative commons cc0 1.0 (public domain)
https://creativecommons.org/publicdomain/zero/1.0/