regex help

williamdrescher

regex help

I have a script that mysteriously became double spaced.
I want to search for 2 end of paragraph marks and replace them
with one.  I can do a regex search for $ and find them all, but
$$ finds none - I presume there is something between them that
does not show as a formatting mark.
I don't want to remove all blank lines, just the ones that have a
blank line following.

Or, any other suggestions.

What is the regex for a paragraph mark in replace?

Thanks in advance for the help.

-bill


--
To unsubscribe e-mail to: [hidden email]
Problems? https://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: https://wiki.documentfoundation.org/Netiquette
List archive: https://listarchives.libreoffice.org/global/users/
Privacy Policy: https://www.documentfoundation.org/privacy
Brian Barker

Re: regex help

At 14:19 03/01/2021 -0500, William Noname wrote:
>I have a script that mysteriously became double spaced. I want to
>search for 2 end of paragraph marks and replace them with one.

So it's the paragraphs that are spaced, not lines (the usual meaning
of "double-spaced")?

>I can do a regex search for $ and find them all, but $$ finds none -
>I presume there is something between them that does not show as a
>formatting mark.

You are labouring under the common misapprehension that the pilcrow
that indicates the presence of a paragraph break has an actual
presence in the document; no: think of it merely as the indicator
that it is. And you think that the $ symbol matches this; also no.

>I don't want to remove all blank lines, just the ones that have a
>blank line following.

In other words you want to remove blank empty *paragraphs*.

>What is the regex for a paragraph mark in replace?

There isn't one. Instead, the $ symbol merely locks your pattern,
whatever it is, so that it will match only something that occurs at
the end of a paragraph.

>Or, any other suggestions.

Don't think of searching for paragraph breaks, which you cannot do.
Instead, search for the empty paragraphs that you wish to delete.
They contain nothing, so the pattern you need is also nothing. But
you need to arrange that it matches nothing only if that nothing
occurs at the beginning of a paragraph and at the same time at the
end of a paragraph. The symbol for locking to the start of a
paragraph is the caret, "^", so the pattern you should search for is
"^$" (no quotes, of course) - replacing with nothing.

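For readers who want to try the same idea outside Writer, here is a minimal Python sketch of the "^$" pattern. It assumes each paragraph is held as a separate string in a list; the function name drop_empty_paragraphs is made up for illustration, and Python's re module is not the engine LibreOffice uses, so treat this as an analogy rather than a description of Find & Replace.

```python
import re

# "^" locks the match to the start and "$" to the end, so "^$" can only
# match a paragraph that contains nothing at all.
EMPTY_PARAGRAPH = re.compile(r"^$")


def drop_empty_paragraphs(paragraphs):
    """Return the paragraphs with every empty one removed.

    Assumption: each list item is one paragraph with no embedded newline.
    """
    return [p for p in paragraphs if not EMPTY_PARAGRAPH.search(p)]


if __name__ == "__main__":
    print(drop_empty_paragraphs(["first", "", "second", "", "", "third"]))
```

Note that a paragraph holding only a space is not empty and is left alone, which mirrors the point that the pattern matches "nothing" and only nothing.
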
I trust this helps.

Brian Barker



williamdrescher

Re: regex help

On 1/3/2021 4:09 PM, Brian Barker wrote:
> At 14:19 03/01/2021 -0500, William Noname wrote:
>> I have a script that mysteriously became double spaced. I want
>> to search for 2 end of paragraph marks and replace them with one.
>
> So it's the paragraphs that are spaced, not lines (the usual
> meaning of "double-spaced")?

Yes, I do mean lines are double spaced, but...

I can't do what I want to do in my programming editor and when I
read the script into LO the line endings are suppressed and each
line is treated as a paragraph.

>
>> I can do a regex search for $ and find them all, but $$ finds
>> none - I presume there is something between them that does not
>> show as a formatting mark.
>
> You are labouring under the common misapprehension that the
> pilcrow that indicates the presence of a paragraph break has an
> actual presence in the document; no: think of it merely as the
> indicator that it is. And you think that the $ symbol matches
> this; also no.

Thank you.

>
>> I don't want to remove all blank lines, just the ones that
>> have a blank line following.
>
> In other words you want to remove blank empty *paragraphs*.


Yes, but I have intentional blank lines that I do not want to remove.

>
>> What is the regex for a paragraph mark in replace?
>
> There isn't one. Instead, the $ symbol merely locks your
> pattern, whatever it is, so that it will match only something
> that occurs at the end of a paragraph.
>
>> Or, any other suggestions.
>
> Don't think of searching for paragraph breaks, which you cannot
> do. Instead, search for the empty paragraphs that you wish to
> delete. They contain nothing, so the pattern you need is also
> nothing. But you need to arrange that it matches nothing only
> if that nothing occurs at the beginning of a paragraph and at
> the same time at the end of a paragraph. The symbol for locking
> to the start of a paragraph is the caret, "^", so the pattern
> you should search for is "^$" (no quotes, of course) -
> replacing with nothing.

Is there a way to select a paragraph followed by an empty paragraph?

>
> I trust this helps.
>
> Brian Barker
>

--
Bill Drescher
william {at} TechServSys {dot} com


Brian Barker

Re: regex help

At 07:29 04/01/2021 -0500, Bill Drescher wrote:
>On 1/3/2021 4:09 PM, Brian Barker wrote:
>>At 14:19 03/01/2021 -0500, Bill Drescher wrote:
>>>I have a script that mysteriously became double spaced. I want to
>>>search for 2 end of paragraph marks and replace them with one.
>>
>>So it's the paragraphs that are spaced, not lines (the usual
>>meaning of "double-spaced")?
>
>Yes, I do mean lines are double spaced, but...

If the lines of a paragraph really are double-spaced, you need to go
to the paragraph or paragraph style formatting and change "Line
spacing" from Double back to Single. Problem solved. In that case,
you would not be - as you suggested - looking for anything to search
for. If that is not your solution, you don't have double spacing and
it would help you to understand what you do have (and indeed to
explain your problem on the list) by using the appropriate terms.
Remember that the concept of "lines" as you appear to be using it
went out with lined manuscript paper or typewriters.

>I can't do what I want to do in my programming editor and when I
>read the script into LO the line endings are suppressed and each
>line is treated as a paragraph.

If each line is a separate paragraph but they are still spaced too
much, either you have empty paragraphs (not "lines") between your
real paragraphs or you have paragraph spacing (in paragraph or
paragraph style formatting) that you don't want. Remove paragraph
spacing there.

>>>I don't want to remove all blank lines, just the ones that have a
>>>blank line following.
>>
>>In other words you want to remove blank empty *paragraphs*.
>
>Yes, but I have intentional blank lines that I do not want to remove.

The only "blank lines" are created if you have consecutive *line
breaks* between text. Is that really what you have?

Oh, and if you are describing the so-called "blank lines" you want to
remove the same way as the "blank lines" you want to preserve, how
are we to know - indeed, how is LibreOffice to know - which is which?
There has to be a specified difference if either a machine or an
earnest human can distinguish them.
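
One possible "specified difference" can be sketched in Python, under a loud assumption: that the double spacing arose because exactly one empty line was inserted after every original line, for example during a line-ending conversion. Nothing in the thread confirms the file has this shape. If it does, every second line is an inserted extra, and the intentional blank lines sit in the other positions and survive. The function name undo_double_spacing is hypothetical.

```python
def undo_double_spacing(lines):
    """Drop the empty line assumed to follow every original line.

    Assumption (not confirmed by the thread): lines at positions
    1, 3, 5, ... are all inserted extras and are all empty.
    """
    extras = lines[1::2]
    if any(extra != "" for extra in extras):
        # The file does not fit the assumed pattern, so refuse to guess.
        raise ValueError("input is not uniformly double spaced")
    return lines[0::2]


if __name__ == "__main__":
    doubled = ["echo one", "", "", "", "echo two", ""]
    print(undo_double_spacing(doubled))
```

In the usage example the original script had an intentional blank line between the two commands; after doubling it appears as a run of three empty lines, and the sketch restores a single one. The function raises an error rather than deleting anything when the pattern does not hold.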

>Is there a way to select a paragraph followed by an empty paragraph?

I'm not sure there is. But how would that help you?
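
Outside LibreOffice, on the plain text file itself, "a line followed by an empty line" can be expressed with a lookahead. The following Python sketch assumes the script is read as one string with "\n" line endings; the names FOLLOWED_BY_EMPTY and lines_before_empty are invented for illustration, and this says nothing about what Writer's Find & Replace can do.

```python
import re

# "^.+" matches a whole non-empty line (with MULTILINE, "^" also matches
# right after each newline). The lookahead "(?=\n\n)" requires that the
# line's own newline is immediately followed by another newline, i.e.
# that the next line is empty, without consuming either newline.
FOLLOWED_BY_EMPTY = re.compile(r"^.+(?=\n\n)", re.MULTILINE)


def lines_before_empty(text):
    """Return every non-empty line that is directly followed by an empty line.

    Assumption: line endings are "\n". A line holding only spaces
    counts as non-empty here.
    """
    return FOLLOWED_BY_EMPTY.findall(text)


if __name__ == "__main__":
    print(lines_before_empty("a\n\nb\nc\n\nd\n"))
```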

Here's a thought. (I'm guessing.) Do you actually have text that
should run on within paragraphs but has somehow become separated such
that each line of any paragraph has become a separate paragraph? If
so, you may be able to reassemble paragraphs using AutoCorrect with
"Combine single line paragraphs if length greater than ..." to a
suitably small value.

Should you be sending a sample document to someone (or to the list if
it would be accepted) for diagnosis?

I trust this helps.

Brian Barker



williamdrescher

Re: regex help

Thank you very much Brian.

I decided that it would be easier to be the "earnest human" and
manually delete the extraneous lines (paragraphs).

Maybe when I have more time I will come back to this and post an
example to the list.

--

Bill Drescher
william {at} TechServSys {dot} com

