Regex glossaries?
Thread poster: Piotr Bienkowski
Piotr Bienkowski
Poland
English to Polish
Jan 28, 2016

I want to learn something new but can't find the resources.

I want to use the regex glossary feature to quickly insert all-uppercase words such as ZORDRETSPECSALES. So I created a glossary and entered:

[A-Z]+{tab}$0

I reloaded the glossary, but nothing happens, i.e. the uppercase word is not highlighted. Is the regex wrong, or is it something else?

{tab} represents the tab character.


 
FarkasAndras
English to Hungarian
? Jan 28, 2016

I don't know CT (CafeTran), so I can't offer much help; I'm sure CT users will show up with insider info any minute now.
Still:
1) A-Z usually matches only unaccented Latin letters, i.e. it won't match É or Ł. This varies from tool to tool and should be tested. Possible expressions for matching all uppercase letters include [[:upper:]]+ and \p{Uppercase}+. Check the CT docs for info.
2) What's the $0? $ usually stands for "end of string". $1 stands for "first captured group", but that would require you to put parentheses around part of the pattern first, e.g. ([A-Z]+){tab}$1
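
The two points above can be checked outside the tool. The following is a minimal sketch that assumes CafeTran uses Java's java.util.regex flavor (an assumption; the thread does not confirm it). The class and method names are hypothetical, chosen only for illustration. It shows that [A-Z]+ stops at accented capitals while \p{Lu}+ does not, and that in Java replacement strings $0 refers to the whole match whereas $1 needs a capturing group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GlossaryRegexDemo {
    // Returns the first substring of text matched by regex, or "" if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String ascii = "ZORDRETSPECSALES";
        // "ŁÓDŹ", written with escapes to avoid source-encoding problems
        String polish = "\u0141\u00D3D\u0179";

        System.out.println(firstMatch("[A-Z]+", ascii));    // whole word
        System.out.println(firstMatch("[A-Z]+", polish));   // only the unaccented "D"
        System.out.println(firstMatch("\\p{Lu}+", polish)); // whole word

        // In Java replacement strings, $0 is the whole match;
        // $1 is the first capturing group and needs parentheses in the pattern.
        System.out.println(ascii.replaceAll("[A-Z]+", "[$0]"));
        System.out.println(ascii.replaceAll("([A-Z]+)", "[$1]"));
    }
}
```

If the tool follows Java's rules, [A-Z]+{tab}$0 would be a valid way to say "insert the matched word itself", so the regex alone may not be the cause of the problem.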

[Edited at 2016-01-28 16:13 GMT]


 
Selcuk Akyuz
Türkiye
English to Turkish
nontranslatables Jan 28, 2016

Hi Piotr,

I think you can use nontranslatables (shortcut F4 on Windows) for this purpose as well.

You store your nontranslatables in:
C:\CafeTran Espresso\cafetran\resources\placeables\nontranslatables.txt

I have the following in my list:
|[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][a-z]+\s[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][A-Z]+
|[a-z]+[A-Z]+[a-z]+
|[A-z]*+[0-9]+[A-z]*+
|((mailto\:|(news|(ht|f)tp(s?))\://){1}\S+)
|[\w-]+([\w-]+\.)+[\w-]+
|[\w-]+@([\w-]+\.)+[\w-]+
|[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*+
|[a-z]([a-z0-9]*[A-Z][A-Z0-9]*[a-z]|[A-Z0-9]*[a-z][a-z0-9]*[A-Z])[a-zA-Z0-9]*+
GmbH

I have no idea why your solution does not work.

Is that entry stored in your regex glossary? If not, a leading pipe character may be needed:
|[A-Z]+{tab}$0
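
Before adding entries like these, it can help to try them against sample strings. The sketch below again assumes Java's regex flavor, and it assumes the leading pipe is list syntax rather than part of the pattern, so it strips it; both are assumptions, and the class and method names are hypothetical. It also shows that [A-z] covers more than letters: the range runs from ASCII 65 to 122, so it accepts [ \ ] ^ _ and the backtick as well.

```java
import java.util.regex.Pattern;

class NontranslatableCheck {
    // Strips a leading '|' (assumed to be list syntax, not regex) and
    // reports whether the remaining pattern matches the whole sample.
    static boolean matchesWhole(String entry, String sample) {
        String regex = entry.startsWith("|") ? entry.substring(1) : entry;
        return Pattern.matches(regex, sample);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("|[A-Z][A-Z]+", "ZORDRETSPECSALES"));
        System.out.println(matchesWhole("|[A-Z][a-z]+\\s[A-Z][a-z]+", "Trados Studio"));
        System.out.println(matchesWhole("|[\\w-]+@([\\w-]+\\.)+[\\w-]+", "user@example.com"));

        // [A-z] also accepts the underscore; [A-Za-z] does not.
        System.out.println(matchesWhole("|[A-z]*+[0-9]+[A-z]*+", "_1_"));
        System.out.println(matchesWhole("|[A-Za-z]*+[0-9]+[A-Za-z]*+", "_1_"));
    }
}
```

If only letters are intended around the digits, [A-Za-z] is the safer spelling of that class.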


 
Piotr Bienkowski
Poland
English to Polish
TOPIC STARTER
I copied your list... Jan 28, 2016

I closed the project and reopened it, but only completely restarting CafeTran did the trick. Now I don't know whether it is your list that works or my regex glossary.






 
Selcuk Akyuz
Türkiye
English to Turkish
my list possibly Jan 28, 2016

You can test it: simply close your glossary, and if the words are still highlighted, that proves the nontranslatables are working.

A version of your regex, ([A-Z]+) or ([A-Z][A-Z]+) (I don't remember which one now), highlights matches, but you should right-click on the glossary and select match case.

I could not find a regex to highlight É; the following one failed:

|\b[\p{Upper}]\b

-----------------------

Found it

|\b[\p{Lu}]\b
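
A possible explanation for why \p{Upper} failed and \p{Lu} worked, assuming the tool uses Java's regex flavor (an assumption, not confirmed in this thread): in java.util.regex, \p{Upper} is a POSIX class that is ASCII-only by default, while \p{Lu} is the Unicode "uppercase letter" category. The sketch below uses hypothetical class and method names to show the difference, and shows that the UNICODE_CHARACTER_CLASS flag makes \p{Upper} Unicode-aware.

```java
import java.util.regex.Pattern;

class UpperClassDemo {
    // Reports whether regex, compiled with the given flags, matches anywhere in text.
    static boolean finds(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9"; // "É", written as an escape to avoid encoding issues

        // POSIX class, ASCII-only by default: does not match É
        System.out.println(finds("\\p{Upper}", 0, eAcute));
        // Unicode general category Lu: matches É
        System.out.println(finds("\\p{Lu}", 0, eAcute));
        // With the flag, \p{Upper} follows Unicode and matches É
        System.out.println(finds("\\p{Upper}", Pattern.UNICODE_CHARACTER_CLASS, eAcute));
    }
}
```

Note that \b[\p{Lu}]\b has no quantifier, so it matches only a single uppercase letter standing on its own; \p{Lu}+ would be needed to cover whole uppercase words.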






[Edited at 2016-01-28 20:32 GMT]


 

