Unicode ́ (U+301) error in biblatex, but not in main text: {\'{\i}}












When compiling my document embedding references using biblatex, I get the error message:




Package inputenc Error: Unicode char ́ (U+301)(inputenc) not set up
for use with LaTeX




With the help of the various Unicode/biblatex questions on this site, I identified the character {\'{\i}} in one of the references as the culprit. Interestingly, typesetting {\'{\i}} in the main text does not throw an error message:



\begin{filecontents}{biblio.bib}
@Article{Zheng2016,
%author = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodr{\'{\i}}guez-Calero and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
author = {Qinsi Zheng and Gabriel G. Rodr{\'i}guez-Calero and Steffen Jockusch and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
title = {Intra-molecular triplet energy transfer is a general approach to improve organic fluorophore photostability},
journal = {Photochemical {\&} Photobiological Sciences},
year = {2016},
volume = {15},
number = {2},
pages = {196--203},
doi = {10.1039/c5pp00400d},
publisher = {Royal Society of Chemistry ({RSC})},
}

@Article{Pennacchietti2018,
author = {Francesca Pennacchietti and Ekaterina O. Serebrovskaya and Aline R. Faro and Irina I. Shemyakina and Nina G. Bozhanova and Alexey A. Kotlobay and Nadya G. Gurskaya and Andreas Bod{\'{e}}n and Jes Dreier and Dmitry M. Chudakov and Konstantin A. Lukyanov and Vladislav V. Verkhusha and Alexander S. Mishin and Ilaria Testa},
title = {Fast reversibly photoswitching red fluorescent proteins for live-cell {RESOLFT} nanoscopy},
journal = {Nature Methods},
year = {2018},
volume = {15},
number = {8},
month = {jul},
pages = {601--604},
doi = {10.1038/s41592-018-0052-9},
publisher = {Springer Nature America, Inc},
}
\end{filecontents}



\documentclass[pdfa,a4paper,11pt,
bibliography=totoc,
numbers=noenddot,
abstracton,
twoside,openright,
parskip=half]{scrartcl}

\usepackage[english]{babel} % provides the dictionary for proper hyphenation
\frenchspacing % single space after full stop
\raggedbottom
\usepackage[utf8]{inputenc} % input encoding of the source files

\usepackage{filecontents}

\usepackage{csquotes} % needed for babel / polyglossia
\usepackage[
natbib = true, % allows usage of \citet, \citep etc. commands
citestyle = authoryear, bibstyle = authoryear, %
backend = biber, %
sortcites = true, % sorts multiple refs in one cite command
hyperref = true, %backref = true, %
giveninits = true, %
terseinits = false, % if true: D. E. => DE
%uniquelist = true,
maxbibnames = 30, maxcitenames = 2, %
uniquename = init, uniquelist = minyear, % uniquelist = minyear only cites 2nd author if first author and year are identical
date = year,
url = false, isbn = false]{biblatex} % package for the bibliography
\addbibresource{biblio.bib}
\usepackage{hyperref} % cross-referencing

\begin{document}

\section{Introduction}
\citep{Zheng2016}
\citep{Pennacchietti2018}

S\'{\i}

\printbibliography

\end{document}


Trying to solve the problem, I found several approaches on this site:




  • Using {\'i}, as suggested in this answer, works. However, for automatically imported bibliography entries it is tedious to find all of the offending characters, especially since the error might occur with different combinations of precomposed characters, as suggested here.


  • I therefore tried running Biber with the --output-safechars option, as suggested in this answer. Compiling manually from the terminal, it seems to work.


  • However, I prefer to use latexmk for compilation (especially when the workflow requires multiple runs of various programs). I then found this answer, explaining how to pass Biber options to latexmk. I created the file .latexmkrc in the local directory, containing the line $biber='biber --output-safechars';. This finally works.



I am afraid, however, that this whole workflow is beyond my boss's willingness to put up with the quirks of LaTeX.



So I guess I have two questions here:



1) Is there any way to remove the offending characters automatically? I found this answer, but I am afraid that it is way beyond my understanding.



2) If there isn't, is there any way to force latexmk/Biber to compile such characters properly without any additional files or setup? Ideally, I'm looking for some magic commands that I could "sneak in unnoticed" at the beginning of the .tex file.



Edit:
I just tested the workflow using the .latexmkrc on my whole document, which now throws an error




Undefined control sequence.
in the line just after the \printbibliography command. Apparently some entry in my bibliography of 200+ entries clashes with the --output-safechars option.




I'll look into it, but it seems this workflow might not work for me in the end either.










biblatex unicode latexmk






asked 2 hours ago, edited 1 hour ago
– Wiebke












  • It's a guess here, for I don't really know if the sourcemap will kick in before the error occurs, or not. But, after the attempts you described, I would try a regex replace of {\'{\i}} for {\'i} with a sourcemap.
    – gusbrs
    2 hours ago










  • @gusbrs: what's a sourcemap? :)
    – Wiebke
    2 hours ago










  • Wiebke, see the answer moewe just provided. ;-)
    – gusbrs
    2 hours ago




























1 Answer














The best solution™ is of course to use the correct Unicode characters (and ideally the precomposed characters: Åström, not a combination of the combining characters: Åström) in the source.



author    = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},


The benefit of this solution is that it is easier to read, just works, and avoids the additional braces that BibTeX needs. Biber retains support for those braces for simplicity and backwards compatibility, but they can destroy kerning and are otherwise unnecessary for Biber; see How to write “ä” and other umlauts and accented letters in bibliography? for why BibTeX needs them.





If that is not possible and you can't replace {\'{\i}} with {\'i} in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.



The practical drawback of that approach is that you need to add a substitution rule for every possible problematic combination.



To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace \`{\i}, \'{\i}, \^{\i} and \"{\i} (all Latin-1 dotless-i combinations) for (hopefully) all fields where it makes sense.



\documentclass{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
\addbibresource{\jobname.bib}

\DeclareDatafieldSet{setall}{
  \member[datatype=literal]
  \member[datatype=name]
  \member[field=journal]% journal is special since it is
                        % actually journaltitle
}

\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite, foreach={setall}]{
      % \`{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0300}},
            replace=\regexp{\x{00EC}}]
      % \'{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0301}},
            replace=\regexp{\x{00ED}}]
      % \^{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0302}},
            replace=\regexp{\x{00EE}}]
      % \"{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0308}},
            replace=\regexp{\x{00EF}}]
    }
  }
}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{itest,
  author  = {Lo{\"{\i}}c Rodr{\'{\i}}guez-Calero},
  title   = {Lor{\"{\i}}m {\'{\i}}psum and {\`{\i}}v{\^{\i}}n},
  journal = {Dol{\"{\i}}r s{\'{\i}}t},
  note    = {Am{\"{\i}}t cons{\'{\i}}ctur},
  date    = {2018},
}
@article{Zheng2016,
  author  = {Qinsi Zheng and Steffen Jockusch
             and Gabriel G. Rodr{\'{\i}}guez-Calero
             and Zhou Zhou and Hong Zhao and Roger B. Altman
             and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title   = {Intra-molecular triplet energy transfer is a general
             approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year    = {2016},
  volume  = {15},
  number  = {2},
  pages   = {196--203},
  doi     = {10.1039/c5pp00400d},
}
\end{filecontents}

\begin{document}
\parencite{Zheng2016}
\cite{itest}

\printbibliography
\end{document}


Rodríguez-Calero, Loïc (2018). “Lorïm ípsum and ìvîn”. In: Dolïr sít. Amït consíctur.





Why is this Unicode business such an issue?



Unicode combines characters by adding the combining marks after the base glyph. LaTeX works exactly the other way round: the combining accents are added before the glyph (as a macro that gets the base glyph as argument).



Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. That is done according to simple translations for macros into Unicode points and the complex Unicode rules.



Combining characters involving i are particularly complicated since LaTeX usually bases its accented characters on the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer its combining characters based on the 'small i' (i, U+0069); see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'{\i} to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).



LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301) while it handles í (U+00ED) just fine.
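
The difference can be seen without biblatex at all. The following is only a minimal sketch, assuming pdfLaTeX with the utf8 option of inputenc; the test file is made up for illustration and is not part of the question. The accent macros and the precomposed character compile, while the decomposed sequence (the form Biber writes into the .bbl file) is the one that inputenc rejects.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
% Accent macros: LaTeX builds the glyph itself, so inputenc is not involved.
S\'{\i} and S\'i

% Precomposed U+00ED: a code point that inputenc knows.
Sí

% Decomposed U+0131 + U+0301: uncommenting the next line should reproduce
% the "Unicode char ... (U+301) not set up for use with LaTeX" error.
% Sı́
\end{document}
```

This also suggests why the same name is harmless in the main text: there the macro is typeset directly, whereas the bibliography data passes through Biber's conversion to Unicode first.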



See also PLK's explanation in the linked answer as well as comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.





Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX and a font that has properly kerned accents (Linux Libertine, for example).
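
A minimal sketch of such a setup, to be compiled with lualatex (or xelatex) followed by biber. The font name Linux Libertine O is an assumption: it must be installed under that name on your system, and any other OpenType font with good accent support could be substituted. The file biblio.bib is the one from the question.

```latex
% Compile with: lualatex, biber, lualatex
\documentclass{article}
\usepackage{fontspec}             % Unicode font loading; inputenc is not needed
\setmainfont{Linux Libertine O}   % assumption: font installed under this name
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}       % the .bib file from the question

\begin{document}
\parencite{Zheng2016}

\printbibliography
\end{document}
```

With a Unicode engine, the combination U+0131 + U+0301 is passed to the font as it is, so how well the accent is placed depends on the font.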





























  • I didn't know "best solution" was trade marked! I hope you don't have to pay any royalties for the ones you use to provide. ;-)
    – gusbrs
    2 hours ago










  • +1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?
    – Dr. Manuel Kuehner
    2 hours ago












  • Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is being prepared for journal submission). That's also why I don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have the specific interpreters that might be needed?
    – Wiebke
    1 hour ago










  • @Wiebke To my knowledge almost no journals can accept biblatex submissions (tex.stackexchange.com/q/12175/35864). If you intend to publish this paper you should probably go back to simple BibTeX (maybe with natbib). If you decided where you want to publish, check their submission guidelines, they'll either have a ready-made .bst, will tell you to use a standard one or will want you to use thebibliography.
    – moewe
    59 mins ago










  • @Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber (which is written in Perl hence this is all being very Perl-y; Biber is usually installed as a stand-alone executable, which brings its own Perl modules, so no Perl or Python installation is required).
    – moewe
    44 mins ago











Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "85"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f469555%2funicode-u301-error-in-biblatex-but-not-in-main-text-i%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









6














The best solution™ is of course to use the correct Unicode characters (and ideally the precomposed characters: Åström, not a combination of the combining characters: Åström) in the source.



author    = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},


The benefit of this solution is that it is easier to read, just works and avoids the additional braces that BibTeX needs (and that are retained in Biber for simplicity and backwards compatibility, those braces could destroy kerning and are otherwise unnecessary for Biber, see How to write “ä” and other umlauts and accented letters in bibliography? for why they are needed for BibTeX).





If that is not possible and you can't replace {'{i}} with {'i} in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.



The logistic drawback of that approach is that you need to add a substitution rule for every possible problematic combination.



To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace `{i}, '{i}, ^{i} and "{i} (all Latin-1 dotless-i combinations) for (hopefully) all fields where it makes sense.



\documentclass{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
\addbibresource{\jobname.bib}

\DeclareDatafieldSet{setall}{
  \member[datatype=literal]
  \member[datatype=name]
  \member[field=journal]% journal is special since it is
                        % actually journaltitle
}

\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite, foreach={setall}]{
      % \`{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0300}},
            replace=\regexp{\x{00EC}}]
      % \'{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0301}},
            replace=\regexp{\x{00ED}}]
      % \^{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0302}},
            replace=\regexp{\x{00EE}}]
      % \"{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0308}},
            replace=\regexp{\x{00EF}}]
    }
  }
}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{itest,
  author  = {Lo{\"{\i}}c Rodr{\'{\i}}guez-Calero},
  title   = {Lor{\"{\i}}m {\'{\i}}psum and {\`{\i}}v{\^{\i}}n},
  journal = {Dol{\"{\i}}r s{\'{\i}}t},
  note    = {Am{\"{\i}}t cons{\'{\i}}ctur},
  date    = {2018},
}
@article{Zheng2016,
  author  = {Qinsi Zheng and Steffen Jockusch
             and Gabriel G. Rodr{\'{\i}}guez-Calero
             and Zhou Zhou and Hong Zhao and Roger B. Altman
             and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title   = {Intra-molecular triplet energy transfer is a general
             approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year    = {2016},
  volume  = {15},
  number  = {2},
  pages   = {196--203},
  doi     = {10.1039/c5pp00400d},
}
\end{filecontents}

\begin{document}
\parencite{Zheng2016}
\cite{itest}

\printbibliography
\end{document}


Rodríguez-Calero, Loïc (2018). “Lorïm ípsum and ìvîn”. In: Dolïr sít. Amït consíctur.
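
If other accents over the dotless i turn up in your .bib file, further steps could probably be added in the same pattern. The following is only a sketch and untested: it assumes that Biber decodes \~{\i}, \={\i} and \u{\i} to the dotless i followed by the respective combining mark, just as it does for the four accents handled in the sourcemap, and that inputenc is set up for the precomposed target characters in your document. As far as I know biblatex honours only one \DeclareSourcemap per document, so the steps would have to go inside the existing \map[overwrite, foreach={setall}]{...} block rather than into a second declaration.

```latex
% Hypothetical additional steps, to be placed inside the existing
% \map[overwrite, foreach={setall}]{...} block (not a stand-alone document).
      % \~{\i}: U+0131 + U+0303 (combining tilde)  -> U+0129
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0303}},
            replace=\regexp{\x{0129}}]
      % \={\i}: U+0131 + U+0304 (combining macron) -> U+012B
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0304}},
            replace=\regexp{\x{012B}}]
      % \u{\i}: U+0131 + U+0306 (combining breve)  -> U+012D
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0306}},
            replace=\regexp{\x{012D}}]
```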





Why is this Unicode business such an issue?



Unicode composes characters by placing the combining marks after the base character. LaTeX works exactly the other way round: the accent comes before the base glyph (as a macro that takes the base glyph as its argument).



Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. The conversion relies on simple translations of macros into Unicode code points and on the complex Unicode rules.



Combining characters involving i are particularly complicated, since LaTeX usually bases its accented characters on the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer basing its combining sequences on the 'small i' (i, U+0069), see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'\i to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).



LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301) while it handles í (U+00ED) just fine.
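
That is also why the accent works in the main text but not in the bibliography: in the main text LaTeX sees the macro, in the .bbl file it sees what Biber made of it. A minimal sketch to see the difference with pdfLaTeX, assuming the file is saved as UTF-8 (the T1 font encoding is my choice here and not taken from the question):

```latex
% Minimal sketch for pdfLaTeX; the file is assumed to be saved as UTF-8.
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
Rodr\'iguez     % accent macro applied to a plain i

Rodr\'{\i}guez  % accent macro applied to the dotless i, also fine

Rodríguez       % precomposed U+00ED, which inputenc knows

% Rodrı́guez
% The commented line above contains U+0131 followed by U+0301.
% Uncommenting it should reproduce the "(U+301) not set up" error.
\end{document}
```

The first three lines go through LaTeX's own accent handling or a code point that inputenc has a definition for; the commented line contains a lone combining accent that inputenc has no definition for.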



See also PLK's explanation in the linked answer as well as comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.





Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX together with a font that has properly kerned accents (Linux Libertine, for example).
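
A rough sketch of what such a setup could look like. It assumes that the libertine package is installed and selects the OpenType version of Linux Libertine under LuaLaTeX and XeLaTeX, and that the bibliography is in biblio.bib as in the question. Compile with lualatex (or xelatex), then biber, then the engine again.

```latex
% Sketch for LuaLaTeX or XeLaTeX: no inputenc and no sourcemap.
\documentclass{article}
\usepackage[english]{babel}
\usepackage{libertine}  % assumption: loads Linux Libertine for Unicode engines
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}  % file name taken from the question

\begin{document}
\parencite{Zheng2016}

\printbibliography
\end{document}
```

Here the engine reads the combining sequence from the .bbl file directly and leaves the placement of the accent to the font.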






  • I didn't know "best solution" was trademarked! I hope you don't have to pay any royalties for the ones you usually provide. ;-)
    – gusbrs
    2 hours ago










  • +1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?
    – Dr. Manuel Kuehner
    2 hours ago












  • Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is being prepared for journal submission). That's also why I don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have the specific interpreters that might be needed?
    – Wiebke
    1 hour ago










  • @Wiebke To my knowledge almost no journals can accept biblatex submissions (tex.stackexchange.com/q/12175/35864). If you intend to publish this paper you should probably go back to simple BibTeX (maybe with natbib). Once you have decided where you want to publish, check their submission guidelines: they'll either have a ready-made .bst, tell you to use a standard one, or want you to use thebibliography.
    – moewe
    59 mins ago










  • @Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber. It is written in Perl (hence this is all very Perl-y), but it is usually installed as a stand-alone executable that brings its own Perl modules, so no Perl or Python installation is required.
    – moewe
    44 mins ago
















edited 1 hour ago

answered 2 hours ago

moewe
