Bug Tracker 
| ID | 205 |
|---|---|
| Date | 2022-06-10 13:34:21 |
| Last update | 2022-10-19 20:07:48 |
| Status | Closed (Fixed) | 
| Category | glossaries-extra | 
| Version | 1.48 | 
| Summary | Entry counting and entry resetting | 
Description
Hi Nicola,
This is from [TeX.SX Link].
I'm a bit unsure about this one, since I might be missing something about why things are how they are (I'm still getting acquainted with the package) but, that disclaimer given, I thought best to bring it to your consideration, in the spirit of a false positive being somewhat preferable to a false negative.
That said, to the point. Both \@gls@write@entrycounts and \@gls@write@entryunitcounts write the counting information to the .aux file \AtEndDocument, looping over all defined entries, but they only write anything if the entry is marked as used (unset) at that point (test \ifglsused{\@glsentry}). However, there are a number of reasons why an entry might have been reset along the document and, if a given entry is not unset again after that, no information for it is stored, and the entry is considered as "never having been unset" on the next run (count 0, and for every unit if unit counting is in use), with the implications of this. I'm not sure why the counting data is not unconditionally stored for all entries. If it is just to spare computational cost, I'd personally be inclined to prefer paying this cost. (Below I bring a use case where the procedure represents a problem.)
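A minimal sketch of the kind of test described above, using only the public commands \forallglsentries and \ifglsused from glossaries. It is not the package's internal code: the entry labels alpha and beta and the loop variable \thislabel are made up for illustration, and the messages go to the terminal and log rather than to the .aux file.

```latex
\documentclass{article}
\usepackage{glossaries}
% Two hypothetical entries, defined only for this illustration.
\newglossaryentry{alpha}{name={alpha},description={stays marked as used}}
\newglossaryentry{beta}{name={beta},description={reset after its use}}
\begin{document}
\gls{alpha} and \gls{beta}.
\glsreset{beta}% beta's first use flag is now back to "unused"
% Loop over all defined entries and branch on the first use flag,
% the same kind of condition the report says guards the write.
\forallglsentries{\thislabel}{%
  \ifglsused{\thislabel}%
    {\typeout{\thislabel: marked as used here, would be written}}%
    {\typeout{\thislabel: not marked as used here, would be skipped}}%
}
\end{document}
```

Here beta has been referenced in the document, but because it was reset afterwards it takes the "skipped" branch, which is the situation the report is concerned with.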
A second thing is that the entry counting (by first use flag) is documented as counting "the total number of times \glsunset is used"; however, this may no longer hold if \glsreset is used. True, this is documented with "\glsreset resets the count and is best avoided". And this is something where I may really be missing the point, but still, why reset the count? It is not as if this were a behavior of the "original" \glsreset which brings an adverse effect. The package actually redefines it, when entry counting is enabled, so that it resets the count. And then the documentation says it is "best avoided" to use them together because of this? As far as I understand, resetting the count and resetting the entry are conceptually different things, despite the common verb. Why not separate the operations and offer something like \glsresetentrycount?
A use case where the above characteristics of the entry counting machinery complicate things is the one mentioned in the link above. I was trying to achieve the behavior "long form for single use, long (short) form for first use, short form for subsequent uses", but separately for each chapter. I had found [TeX.SX Link] for resetting entries each chapter, and thought to combine that with entry unit counting. I came up with something along the lines of the provided MWE.
Now, this document results in only GHI reaching the list of abbreviations, and in every occurrence in every chapter except the last being typeset in long form. The reason in this case is the count reset done together with the entry resetting. Since we are calling \glsresetall in the cmd/chapter/before hook, which runs before \refstepcounter has been called, we actually reset the count of the chapter which is ending at that point, for all chapters except the last. Indeed, the stored data is:
\@gls@entry@unitcount{ABC}{0}{chapter.1}
\@gls@entry@unitcount{ABC}{0}{chapter.2}
\@gls@entry@unitcount{ABC}{1}{chapter.3}
\@gls@entry@unitcount{DEF}{0}{chapter.1}
\@gls@entry@unitcount{DEF}{0}{chapter.2}
\@gls@entry@unitcount{DEF}{1}{chapter.3}
\@gls@entry@unitcount{GHI}{0}{chapter.1}
\@gls@entry@unitcount{GHI}{0}{chapter.2}
\@gls@entry@unitcount{GHI}{3}{chapter.3}
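The hook timing behind this data can be checked in isolation. The following is a small sketch, assuming a LaTeX kernel recent enough to provide generic cmd hooks; it involves no glossaries code at all and only prints the value of the chapter counter at the moment the cmd/chapter/before hook runs.

```latex
\documentclass[oneside]{book}
% Report the chapter counter as seen by code in the "before" hook,
% i.e. before \chapter has stepped the counter for the new chapter.
\AddToHook{cmd/chapter/before}{%
  \typeout{cmd/chapter/before runs with chapter=\arabic{chapter}}%
}
\begin{document}
\chapter{Chapter 1}
Text.
\chapter{Chapter 2}
Text.
\chapter{Chapter 3}
Text.
\end{document}
```

The log should show the hook running while the counter still holds the number of the chapter that is ending (0 before the first chapter), so anything reset inside the hook is attributed to that chapter's unit rather than to the one about to start.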
We could circumvent this problem by creating a dedicated counter with something like:
\newcounter{myuniquechapter}
\GlsXtrEnableEntryUnitCounting{abbreviation}{1}{myuniquechapter}
\AddToHook{cmd/chapter/before}{\stepcounter{myuniquechapter}\glsresetall}
so that we can do the resetting of the entries after the one of the counter, and we don't erase the previous chapter data. That would get us:
\@gls@entry@unitcount{ABC}{3}{myuniquechapter.2}
\@gls@entry@unitcount{ABC}{1}{myuniquechapter.3}
\@gls@entry@unitcount{ABC}{1}{myuniquechapter.4}
\@gls@entry@unitcount{DEF}{1}{myuniquechapter.2}
\@gls@entry@unitcount{DEF}{3}{myuniquechapter.3}
\@gls@entry@unitcount{DEF}{1}{myuniquechapter.4}
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.2}
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.3}
\@gls@entry@unitcount{GHI}{3}{myuniquechapter.4}
And the document looks good. However, this is just a lucky case. Suppose ABC and DEF had not actually been referenced in the last chapter. In this case, the conditional writing of the data to the .aux file would kick in, and we would get:
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.2}
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.3}
\@gls@entry@unitcount{GHI}{3}{myuniquechapter.4}
and a document equal to the one we had when using chapter.
Indeed, it can be worse. We can create a looping document, which does not converge.
Get the modified MWE (our best result thus far) and just move \printabbreviations to the very end of the document. Clear all auxiliary files for a clean slate.
\documentclass[oneside]{book}
\usepackage[
  abbreviations,
  shortcuts=abbr,
]{glossaries-extra}
\makeglossaries
\newcounter{myuniquechapter}
\GlsXtrEnableEntryUnitCounting{abbreviation}{1}{myuniquechapter}
\AddToHook{cmd/chapter/before}{\stepcounter{myuniquechapter}\glsresetall}
\newabbreviation{ABC}{ABC}{Aaaa Bbbb Cccc}
\newabbreviation{DEF}{DEF}{Dddd Eeee Ffff}
\newabbreviation{GHI}{GHI}{Gggg Hhhh Iiii}
\begin{document}
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI}
\chapter{Chapter 2}
\ab{ABC}
\ab{DEF}, \ab{DEF}, \ab{DEF}
\ab{GHI}
\chapter{Chapter 3}
\ab{ABC}
\ab{DEF}
\ab{GHI}, \ab{GHI}, \ab{GHI}
\printabbreviations
\end{document}
Run pdflatex (2x), makeglossaries, pdflatex (2x). We get to what we may call "state 1": the abbreviations in the document look OK, but there is no list of abbreviations at all (I don't know why the boilerplate goes missing in this case). The count data looks good too:
\@gls@entry@unitcount{ABC}{3}{myuniquechapter.1}
\@gls@entry@unitcount{ABC}{1}{myuniquechapter.2}
\@gls@entry@unitcount{ABC}{1}{myuniquechapter.3}
\@gls@entry@unitcount{DEF}{1}{myuniquechapter.1}
\@gls@entry@unitcount{DEF}{3}{myuniquechapter.2}
\@gls@entry@unitcount{DEF}{1}{myuniquechapter.3}
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.1}
\@gls@entry@unitcount{GHI}{1}{myuniquechapter.2}
\@gls@entry@unitcount{GHI}{3}{myuniquechapter.3}
Now, run makeglossaries, pdflatex (2x) again, and we get to "state 2". The list of abbreviations now exists, and contains all entries. But every abbreviation in the document is typeset in long form. Indeed, the count data is completely missing from the .aux file, as a result of \glsresetall being called in the \chapter* of the list of abbreviations combined with the conditional writing of the data.
Now, run makeglossaries, pdflatex (2x) once more, and we are back to "state 1". Since the list of abbreviations goes missing, it no longer resets all entries, so the data gets stored, and that's why the counting appears to work. And so on.
Granted, I'm going against your documented advice in these use cases. :)
But couldn't these play better together?
Best regards,
Gustavo.
MWE
\documentclass[oneside]{book}
\usepackage[
  abbreviations,
  shortcuts=abbr,
]{glossaries-extra}
\makeglossaries
\GlsXtrEnableEntryUnitCounting{abbreviation}{1}{chapter}
\AddToHook{cmd/chapter/before}{\glsresetall}
\newabbreviation{ABC}{ABC}{Aaaa Bbbb Cccc}
\newabbreviation{DEF}{DEF}{Dddd Eeee Ffff}
\newabbreviation{GHI}{GHI}{Gggg Hhhh Iiii}
\begin{document}
\printabbreviations
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI}
\chapter{Chapter 2}
\ab{ABC}
\ab{DEF}, \ab{DEF}, \ab{DEF}
\ab{GHI}
\chapter{Chapter 3}
\ab{ABC}
\ab{DEF}
\ab{GHI}, \ab{GHI}, \ab{GHI}
\end{document}
Evaluation
It is documented that resetting the first use flag will cause interference. A more robust method is to use record counting with bib2gls instead. However, it is possible to prevent \glsreset from resetting the entry counter by adding the following after \GlsXtrEnableEntryUnitCounting:
\renewcommand*{\glsxtrpostreset}[1]{%
  \csuse{@glsxtr@entrycount@org@reset}{#1}%
}%
Update 2022-06-17: The pending v1.49 has a new conditional \ifglsxtrresetcurrcount that determines whether or not to reset the count to 0 when the first use flag is reset. So you'll be able to switch off this behaviour with \glsxtrresetcurrcountfalse.
Update 2022-10-19: correction, the conditional is called \ifglsresetcurrcount and is provided by glossaries v4.50 (but will also be provided by glossaries-extra v1.49 if an older version of glossaries is detected). The default setting is now false, which means the count won't be reset.
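As a rough sketch of how the conditional might be used with the per-chapter reset from the report: the switch \glsresetcurrcountfalse is the one that goes with \ifglsresetcurrcount, and the \ifdefined guard is an addition made here so that the file still compiles with versions that do not provide it. The single entry ABC is just the one from the MWE.

```latex
\documentclass[oneside]{book}
\usepackage[
  abbreviations,
  shortcuts=abbr,
]{glossaries-extra}
\makeglossaries
\GlsXtrEnableEntryUnitCounting{abbreviation}{1}{chapter}
% Keep the running count when the first use flag is reset.
% Guarded so that older package versions without the switch are unaffected.
\ifdefined\glsresetcurrcountfalse
  \glsresetcurrcountfalse
\fi
\AddToHook{cmd/chapter/before}{\glsresetall}
\newabbreviation{ABC}{ABC}{Aaaa Bbbb Cccc}
\begin{document}
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\chapter{Chapter 2}
\ab{ABC}
\printabbreviations
\end{document}
```

With the newer default this explicit setting is redundant; writing \glsresetcurrcounttrue in the same place would instead restore the earlier behaviour of zeroing the count on reset.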
Comments
12 comments.
Date: 2022-06-10 14:37:18
Replying to: anonymous 2022-06-10 14:29:06
I'm sorry if I sounded a little terse. I've spent some months working on a major new version, and I'm getting quite tired and need to move on to more urgent tasks. If I get time, I'll look into it before I release the pending version; otherwise it will have to wait until the next version.
Date: 2022-06-10 14:51:51
Replying to: Nicola Talbot 🦜 2022-06-10 14:37:18
Oh, I'm sorry if I sounded like I was complaining.
I was not "piqued". I was really just trying to get the point across, precisely because I know your time is scarce, and I'm aware the report was long.
So, I have absolutely no issue with however you choose to deal with the report.
Given your comment, if I may suggest, perhaps you might wish to separate the two things, either altogether or just in their "timing". I'd say the conditional writing to the .aux file is probably the thing with more relevant implications and, as far as I can see, storing all the data would be a smooth move. Besides, there's no easy workaround at hand. Not resetting the count on \glsreset, on the other hand, while I do think it would be a good idea, does have clear backward compatibility consequences. And, as you mentioned in the evaluation of the report, there's an easy workaround.
Again, all for your use as you see fit, when you see fit, if you see fit. I'm here in the contributing spirit, not the complaining one. :)
Date: 2022-06-10 14:53:15
Replying to: anonymous 2022-06-10 14:51:51
Thank you :-)
Date: 2022-06-10 15:01:17
Replying to: Nicola Talbot 🦜 2022-06-10 14:53:15
I thank you! :-)
Date: 2022-06-17 14:44:36
Hi Nicola,
reporting on v1.48b testing, as requested.
I've seen the new \ifglsxtrresetcurrcount and corresponding switches, tried them out, and it's looking good. Thank you very much!
(I'm still hoping you consider the case of the data storage at end of document though).
Best regards,
Gustavo.
Date: 2022-06-18 04:32:37
Hi Nicola,
I've been thinking again about this one, particularly my suggestion to store data of all entries at end of document. And, reconsidering, I can see cases where this might not be a good idea. If a user has a large set of entries for multiple documents, and uses a fraction of those in a given document, it would not be a good idea to store data on every defined entry. It appears to be a common practice, but I didn't think of it when making the suggestion. If that was what was staying your hand on the matter, you have a point.
But I think I had a little interesting idea for the problem. Considering that the variable which stores the count for a given entry is created on the fly as needed, we can know if a given entry "has been counted" in a document by testing the existence of the variable. For unit counting the following appears to work:
\documentclass[oneside]{book}
\usepackage[
  abbreviations,
  shortcuts=abbr,
]{glossaries-extra}
\makeglossaries
\GlsXtrEnableEntryUnitCounting{abbreviation}{1}{chapter}
\AddToHook{cmd/chapter/before}{\glsresetall}
\glsxtrresetcurrcountfalse
\makeatletter
\renewcommand*{\@gls@write@entryunitcounts}{%
  \immediate\write\@auxout
    {\string\providecommand*{\string\@gls@entry@unitcount}[3]{}}%
  \count@=0\relax
  \forallglsentries{\@glsentry}{%
    \glshasattribute{\@glsentry}{unitcount}%
    {%
      \ifcsundef{glo@\glsdetoklabel{\@glsentry}@unitlist}%
      {}%
      {%
        \forlistcsloop
          {\@gls@write@entryunitcounts@do}%
          {glo@\glsdetoklabel{\@glsentry}@unitlist}%
      }%
      \advance\count@ by \@ne
    }%
    {}%
  }%
  \ifnum\count@=0
    \GlossariesExtraWarningNoLine{Entry counting has been enabled
     \MessageBreak with \string\glsenableentryunitcount\space but the
     \MessageBreak attribute `unitcount' hasn't
     \MessageBreak been assigned to any of the defined
     \MessageBreak entries}%
  \fi
}
\makeatother
\newabbreviation{ABC}{ABC}{Aaaa Bbbb Cccc}
\newabbreviation{DEF}{DEF}{Dddd Eeee Ffff}
\newabbreviation{GHI}{GHI}{Gggg Hhhh Iiii}
\newabbreviation{JKL}{JKL}{Jjjj Kkkk Llll}
\begin{document}
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI}
\chapter{Chapter 2}
\ab{ABC}
\ab{DEF}, \ab{DEF}, \ab{DEF}
\ab{GHI}
\chapter{Chapter 3}
\ab{ABC}
\ab{DEF}
\ab{GHI}, \ab{GHI}, \ab{GHI}
\printabbreviations
\end{document}
This document stores in the .aux file the data of all entries which have been referenced along the document, even if they all have been reset at the hook of \printabbreviations, but does not store data for JKL, which has not been referenced.
Something similar could be done for \@gls@write@entrycounts, but using glo@\glsdetoklabel{#1}@currcount instead.
Well, just an idea, for your consideration.
Best regards,
Gustavo.
Date: 2022-06-22 12:38:00
Replying to: anonymous 2022-06-18 04:32:37
I’m very reluctant to add extra information for each defined entry in the aux file. I’ve added some extra example files to the glossaries performance page in the gallery. In particular, those in Section 8 show a significant increase in overall build times for the large examples with complex builds.
Consider the makeindex 4000/6000 example that jumps from 53 seconds to 8.5 minutes simply by iterating over all entries and adding information to the aux file for every entry marked as used. (For reference, the bib2gls user manual has 4,506 glossary entries, so it’s not infeasible to consider documents with over 4000 entries.)
I think I will probably set \glsxtrresetcurrcountfalse as the default in v1.49 as you’re right it doesn’t make much sense to reset the count when the entry’s first use flag is reset.
Date: 2022-06-22 14:38:15
Replying to: Nicola Talbot 🦜 2022-06-22 12:38:00
Hi Nicola,
about \glsxtrresetcurrcountfalse, thank you.
Regarding:
I’m very reluctant to add extra information for each defined entry in the aux file.
my previous comment precisely granted that this shouldn't be done. And suggested an alternative of only writing the information of entries which have been "counted at least once along the document", regardless of the value of their first use flag at end of document.
I understand that's what you actually tested there in Section 8. And I understand your reluctance, but I'll offer a respectful counterpoint, in "brainstorming" spirit, and will simply respect whatever your call is after that.
Granted, "all defined entries" (actually, only the subset which has the entrycount attribute is being considered already) should not get their data (possibly with many empty values) written to the .aux file. But if a user enabled counting, they would arguably want the data of entries which have been used in the document to be counted, even if there is a toll to it. Currently, glossaries-extra trims the entries by the state of their first use flag at end of document. What I suggested is to write the data for entries which have had their flags set at least once along the document. It is, obviously, a larger set. But choosing the data to store by the state of the flag at an arbitrary point (end of document) discards information which is desirable, and cripples the counting feature somewhat, because you can no longer reset entries along the document, unless you can make sure they will somehow be unset again later on (or by running \glsunsetall, which defeats any saving you are trying to achieve). Sure, there's a cost, but that's the way the feature works. And I'm telling you this as a user who does care about the matter. One of the reasons I've chosen glossaries over acro was performance. The latter would have been more than sufficient for the technical needs of my current working project, but once I had things set up my compilation time for the document had trebled, and had become a bottleneck.
In sum, as things currently are, one should "avoid resetting" with entry counting. If one does obey this documented recommendation, the set of entries which gets written to the .aux file is identical to the set of entries which has "been used at least once along the document" because, if one has not reset any of them, they are unset at end of document. So, the cost would be the same. In other words, my suggestion of storing the data of every entry which has been used at least once in the document is just a way of making entry counting compatible with the use of reset along the document. If one does not obey the recommendation, what one gets is a somewhat smaller cost (still much larger than not enabling the counting feature), and potentially unreliable counting.
That's it. My best stab at it. :-)
Again, for your use, as you see fit, if you see fit.
Best regards,
Gustavo.
Date: 2022-06-22 15:37:05
Replying to: anonymous 2022-06-22 14:38:15
Have you considered using bib2gls with record counting instead? It's more reliable because it doesn't reference the first use flag but instead relies on the number of times an entry has been indexed (recorded). For example:
\documentclass[oneside]{book}
\begin{filecontents*}{\jobname.bib}
@abbreviation{ABC,short={ABC},long={Aaaa Bbbb Cccc}}
@abbreviation{DEF,short={DEF},long={Dddd Eeee Ffff}}
@abbreviation{GHI,short={GHI},long={Gggg Hhhh Iiii}}
\end{filecontents*}
\usepackage[
  record,
  abbreviations,
  shortcuts=abbr,
]{glossaries-extra}
\GlsXtrLoadResources[loc-counters=page]
\AddToHook{cmd/chapter/before}{\glsresetall}
\renewcommand{\ab}{\rgls}
\glsxtrenablerecordcount
\GlsXtrSetRecordCountAttribute{abbreviation}{1}
\renewcommand{\glslinkpresetkeys}{\glsadd[counter=chapter]{\glslabel}}
\renewcommand*{\glsxtrrecordtriggervalue}[1]{%
 \GlsXtrLocationRecordCount{#1}{chapter}{\thechapter}%
}
\begin{document}
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI}
\chapter{Chapter 2}
\ab{ABC}
\ab{DEF}, \ab{DEF}, \ab{DEF}
\ab{GHI}
\chapter{Chapter 3}
\ab{ABC}
\ab{DEF}
\ab{GHI}, \ab{GHI}, \ab{GHI}
\printunsrtabbreviations
\end{document}
If the document is called "myDoc.tex" then the build is:
pdflatex myDoc
bib2gls --record-count-unit myDoc
pdflatex myDoc
(I've had to redefine \ab but I'll adjust \glsxtrenablerecordcount so that it automatically switches the shortcut commands.)
Note that if you want to use hyperref, you'll need to use \theHchapter instead of \thechapter:
\renewcommand*{\glsxtrrecordtriggervalue}[1]{%
 \GlsXtrLocationRecordCount{#1}{chapter}{\theHchapter}%
}
(Although for this particular example it won't make a difference.)
This won't consider any instances of \glsadd unless you set the counter to chapter. For example:
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI}\glsadd{GHI}
This indexes GHI twice with the page counter but only once with the chapter counter, so it won't exceed the trigger value, but this is likely to be desirable since \glsadd doesn't produce any text. However, it will be tripped in the following:
\chapter{Chapter 1}
\ab{ABC}, \ab{ABC}, \ab{ABC}
\ab{DEF}
\ab{GHI} \glsxtrshort{GHI}
This differs for \cgls which isn't affected by \glsxtrshort.
Date: 2022-06-22 16:21:13
Replying to: Nicola Talbot 🦜 2022-06-22 15:37:05
Hi Nicola,
Have you considered using bib2gls with record counting instead?
Oh, it is not about me. I actually dropped entry counting in my current project altogether.
I made the suggestions because the issue seemed relevant while I was trying things out, thinking they might be a good idea for the package.
But, as I promised, I won't insist (further). ;-)
I thank you very much for the careful consideration granted to the suggestions. And I really do understand, and appreciate, the reasons for your reluctance on the matter.
Best regards,
Gustavo.
Page permalink: https://www.dickimaw-books.com/bugtracker.php?key=205




Date: 2022-06-10 14:29:06
I know it is documented behaviour, and I've granted as much in the report.
My point is that it would not be difficult to get an accurate count even when \glsreset is used, provided all the data is stored at end of document and the count is not reset.
Again, I may well be missing reasons why things are how they are; I'm just starting to get better acquainted with glossaries. I just brought an idea for your consideration. Whether it's a good one is up to you. :)
Best regards,
Gustavo.