[erlang-questions] How to extract string between XML tags

Hugo Mills hugo@REDACTED
Sat Sep 29 17:16:58 CEST 2018


   Note that this only works if there are no nested tags of the same
type. For example, it'll get this wrong:

<b>Part of a <b>nested tag</b>...</b>

   (And there's *no* regex that can get this right in general)
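
   A minimal sketch of the problem, and of one possible way around it,
assuming the input is well-formed XML and that xmerl (shipped with OTP)
is available. The module name tag_text and the function names naive/1
and parsed/1 are made up for illustration; the non-greedy pattern in
naive/1 is also just an example, not the regex quoted below.

```erlang
-module(tag_text).
-export([naive/1, parsed/1]).

%% Record definitions for #xmlText{} and friends.
-include_lib("xmerl/include/xmerl.hrl").

%% Regex approach: capture whatever sits between the first <b> and the
%% first </b> that follows it. With nested <b> tags this stops too early.
naive(Xml) ->
    case re:run(Xml, "<b>(.*?)</b>", [{capture, all_but_first, list}]) of
        {match, [Text]} -> Text;
        nomatch -> nomatch
    end.

%% Parser approach: let xmerl build the tree, then collect every text
%% node with XPath and join the values.
parsed(Xml) ->
    {Doc, _Rest} = xmerl_scan:string(Xml),
    Texts = xmerl_xpath:string("//text()", Doc),
    lists:flatten([T#xmlText.value || T <- Texts]).
```

   On the nested example above, tag_text:naive/1 returns
"Part of a <b>nested tag", because the match ends at the first closing
tag. tag_text:parsed/1 goes through a real parser instead, so nesting
depth is not an issue, at the cost of requiring well-formed input.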

   Hugo.

On Sat, Sep 29, 2018 at 11:11:19AM -0400, Lloyd R. Prentice wrote:
> Thanks, Eric!
> 
> Best wishes,
> 
> Lloyd
> 
> > On Sep 29, 2018, at 5:13 AM, PAILLEAU Eric <eric.pailleau@REDACTED> wrote:
> > 
> > Hello,
> > Sorry, I did not see this question before.
> > 
> > A simple regexp that strips every tag is possible: "<\/?[^>]{1,}>"
> > 
> > re:replace("<th>title <b>bold</b></th>","<\/?[^>]{1,}>","", [global, {return, list}]).
> > "title bold"
> > 
> > 
> > 
> >> On 25/09/2018 at 23:56, lloyd@REDACTED wrote:
> >> Hello,
> >> By now I should know how to do this. But I've fumbled for more time than I have to find an elegant solution.
> >> Can anyone show a better way?
> >> Example string: "<th>Firstname</th>"  % NOTE: could be any valid tag
> >> My kludge:
> >> extract_text(TaggedText) ->
> >>   Split = re:split(TaggedText, "<"),
> >>   Split2 = lists:nth(2, Split),
> >>   Split3 = binary_to_list(Split2),
> >>   Split4 = re:split(Split3, ">"),
> >>   Split5 = lists:nth(2, Split4),
> >>   binary_to_list(Split5).
> >> Surely there's a better way.
> >> Many thanks,
> >> LRP
> >> _______________________________________________
> >> erlang-questions mailing list
> >> erlang-questions@REDACTED
> >> http://erlang.org/mailman/listinfo/erlang-questions

-- 
Hugo Mills             | IoT: The S stands for Security.
hugo@REDACTED carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |