keyongtech



 #1  
03-17-08, 04:37 PM
Ian Thompson-Bell
When composing an email I sometimes want to include special characters
like beat and mu in the text. How can I do this?

Cheers

Ian
 #2  
03-17-08, 05:08 PM
Mike Easter
Ian Thompson-Bell wrote:
> When composing an email I sometimes want to include special characters
> like beat and mu in the text. How can I do this?


Presuming s/beat/beta/...

The default charset of Tbird is ISO-8859-1 or Latin-1, an 8 bit charset.
That is also a likely configuration of your recipient's system.

That character set does not include the characters beta and mu. If you
are going to use plaintext characters other than those included in your
and your recipient's plaintext settings, you would have to mutually agree
upon a character set which includes the characters you want.

Depending on your 'aims', such as exchanging physics or math or
chemistry equations or whatever, you might be able to do it with some
choice of plaintext character set, but likely you should be using
something other than plaintext.
 #3  
03-17-08, 06:27 PM
Ian Thompson-Bell
Mike Easter wrote:
> Ian Thompson-Bell wrote:
>> When composing an email I sometimes want to include special characters
>> like beat and mu in the text. How can I do this?

>
> Presuming s/beat/beta/...
>
> The default charset of Tbird is ISO-8859-1 or Latin-1, an 8 bit charset.
> That is also a likely configuration of your recipient's system.
>
> That character set does not include the characters beta and mu.


Are you certain of this? I have since discovered that I can include
ß and µ using Alt-Gr-s and m respectively, as I hope you can see.

As Latin-1 is an 8 bit char set presumably it includes 256 characters.
In the old DOS days you could use an Alt sequence to obtain ones not
available directly via the keyboard. Is this feature not present in Linux?

Cheers

Ian
 #4  
03-17-08, 06:53 PM
Mike Easter
Ian Thompson-Bell wrote:
> Mike Easter wrote:
>
> Are you certain of this? I have since discovered that I can include
> ß and µ using Alt-Gr-s and m respectively, as I hope you can see.
>
> As Latin-1 is an 8 bit char set presumably it includes 256 characters.
> In the old DOS days you could use an Alt sequence to obtain ones not
> available directly via the keyboard. Is this feature not present in
> Linux?


What is in the hex code of your message for the apparent chars beta and
mu are 0xDF & 0xB5, which are 223 and 181, which are 'extended ascii',
extended beyond the 128 of 7 bit ascii - where the chars 128-255 may be
interpreted 'variably' depending on the charset of the recipient.

As it turns out, I can read your chars in both MS's OE and /n/x Tbird,
and the iso names for the characters are 'szlig' and 'micro' -- and/but
if you actually need to use (more of) the charset of the Greek alphabet
in upper or lower case, you would have to use a different strategy.

So, yes, two default mode Tbird users would be able to interpret your
chars as apparently beta and mu, but that's not exactly what they are
and that's about it for 'Greek alphabet looking' chars.

It depends on what you are really trying to do.
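
A minimal Python sketch of why the same byte can look different to different readers. It uses only Python's standard codecs; picking cp437 as the 'old DOS' comparison is an assumption for illustration, not something taken from the message headers.

```python
import unicodedata

# The two byte values found in the message.
for value in (0xDF, 0xB5):
    raw = bytes([value])
    for charset in ("latin-1", "cp1252", "cp437"):
        char = raw.decode(charset)
        print(f"0x{value:02X} as {charset:8}: {char} {unicodedata.name(char)}")
```

Under Latin-1 and Windows-1252 the two bytes decode to LATIN SMALL LETTER SHARP S and MICRO SIGN. Under cp437 the same bytes come out as box-drawing and block graphics, which is the 'variable' interpretation.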
 #5  
03-17-08, 08:50 PM
Ian Thompson-Bell
Mike Easter wrote:
> Ian Thompson-Bell wrote:
>
> What is in the hex code of your message for the apparent chars beta and
> mu are 0xDF & 0xB5, which are 223 and 181, which are 'extended ascii',
> extended beyond the 128 of 7 bit ascii - where the chars 128-255 may be
> interpreted 'variably' depending on the charset of the recipient.
>
> As it turns out, I can read your chars in both MS's OE and /n/x Tbird,
> and the iso names for the characters are 'szlig' and 'micro' -- and/but
> if you actually need to use (more of) the charset of the Greek alphabet
> in upper or lower case, you would have to use a different strategy.
>
> So, yes, two default mode Tbird users would be able to interpret your
> chars as apparently beta and mu, but that's not exactly what they are
> and that's about it for 'Greek alphabet looking' chars.
>
> It depends on what you are really trying to do.
>

In another newsgroup where the talk is about valves (vacuum tubes) and
negative feedback, those two particular symbols crop up regularly. I
simply want to be able to use those same characters. By chance I found
out how to get those particular two but what about the rest? The windows
guys use ALT nnn where nnn are three decimal digits to access characters
in what I believe used to be called the extended ASCII character set.
When discussing such matters cross platform so to speak I guess there is
no guarantee all users will employ the same character set but all the
Windows users seem to get identical results for ALT nnn. At least I
would like a way to be able to access the entire 256 characters in
Thunderbird from the keyboard.

Thanks for your help

Cheers

Ian
 #6  
03-17-08, 09:53 PM
Mike Easter
Ian Thompson-Bell wrote:

> In another newsgroup where the talk is about valves (vacuum tubes) and
> negative feedback, those two particular symbols crop up regularly. I
> simply want to be able to use those same characters. By chance I found
> out how to get those particular two but what about the rest? The
> windows guys use ALT nnn where nnn are three decimal digits to access
> characters in what I believe used to be called the extended ASCII
> character set.


The windows guys are performing like MS-centrics. The typical
windowsguy is using Western European Windows or Windows 1252 charset.
That charset is accessible from alt + numeric-keypad in windows.

> When discussing such matters cross platform so to
> speak I guess there is no guarantee all users will employ the same
> character set but all the Windows users seem to get identical results
> for ALT nnn. At least I would like a way to be able to access the
> entire 256 characters in Thunderbird from the keyboard.


Linux default method is to use an accessory charmap, similar to the one
Win has, which is actually fewer keystrokes than the alt-method but is
not as appealing to keyboarders. Keyboarders will need to create
themselves a compose key and then use that composekey+ letters combo to
make the extended chars.

http://bob.rasey.net/archives/141 How To Type Extended ASCII Characters
in Linux
 #7  
03-18-08, 01:51 AM
Wes Groleau
Mike Easter wrote:
> if you actually need to use (more of) the charset of the Greek alphabet
> in upper or lower case, you would have to use a different strategy.


Surely most Linux distros can handle UTF-8?

???(n? ho)
??????
Союз Советских Социалистических Республик (СССР)

……Written with “Thunderbird”
 #8  
03-18-08, 11:49 AM
Ian Thompson-Bell
Mike Easter wrote:
> Linux default method is to use an accessory charmap, similar to the one
> Win has, which is actually fewer keystrokes than the alt-method but is
> not as appealing to keyboarders. Keyboarders will need to create
> themselves a compose key and then use that composekey+ letters combo to
> make the extended chars.
>


Ah, I just discovered the Ubuntu char map under Accessories. I am not sure
how this works out as fewer keystrokes but no matter; all I need is a
means to enter these characters and now I have it.


> [..] How To Type Extended ASCII Characters
> in Linux
>


Thanks for the link. It's a bit thin on details, especially the format
of the Compose file. I am surprised no one has created an emulation of
the Windows method.

Cheers

Ian
 #9  
03-18-08, 11:50 AM
Ian Thompson-Bell
Wes Groleau wrote:
> Mike Easter wrote:
>> if you actually need to use (more of) the charset of the Greek alphabet
>> in upper or lower case, you would have to use a different strategy.

>
> Surely most Linux distros can handle UTF-8?
>
> ???(n? ho)
> ??????
> Союз Советских Социалистических Республик (СССР)
>
> ……Written with “Thunderbird”
>


Yes, but HOW did you do that?

Cheers

Ian
 #10  
03-18-08, 12:03 PM
Rob van der Putten
Hi there


Ian Thompson-Bell wrote:

A nice UTF-8 test page;
http://www.unicode.org/iuc/iuc10/x-utf8.html

> Yes, but HOW did you do that?


I simply use cut and paste.


Regards,
Rob
 #11  
03-18-08, 12:50 PM
Mike Easter
Wes Groleau wrote:
> Mike Easter wrote:
>> if you actually need to use (more of) the charset of the Greek
>> alphabet in upper or lower case, you would have to use a different
>> strategy.

>
> Surely most Linux distros can handle UTF-8?


The problem with posting in extended characters is that -1- in a 'mixed'
population, you shouldn't assume MS-centric, /n/x-centric, or
Mac-centric; -2- you shouldn't even assume that the /n/x-ers are
going to be using UTF-8, because typically the default is not UTF-8 and
many newsreader users don't know how to reconfigure; and -3- what you
UTF-8/ed below shows as total 'garbage' characters in a Windows system,
displays 'incompletely' in my Mepis Tbird, which is by default
configured not to use UTF-8, and also displays incompletely when I
reconfigure Tbird to use UTF-8 instead of ISO 8859-1.
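
To make the 'garbage' concrete, here is a small Python sketch of what happens when UTF-8 text meets an ISO 8859-1 reader. The sample string is made up for illustration; only standard codecs are used.

```python
text = "\u03b2 and \u03bc"  # Greek small letters beta and mu

utf8_bytes = text.encode("utf-8")

# A reader that assumes ISO 8859-1 shows each two-byte sequence as two characters.
print(utf8_bytes.decode("latin-1"))

# A sender limited to ISO 8859-1 cannot represent the letters at all,
# so each one is replaced by a question mark.
print(text.encode("latin-1", errors="replace"))
```

The first line prints `Î² and Î¼`, the second prints `b'? and ?'`.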
 #12  
03-18-08, 01:23 PM
Ian Thompson-Bell
Rob van der Putten wrote:
> Hi there
>> Ian Thompson-Bell wrote:

>
> A nice UTF-8 test page;
> [..]
>
>> Yes, but HOW did you do that?

>
> I simply use cut and paste.
>

Cutting from what?

Cheers

Ian
 #13  
03-18-08, 02:17 PM
Rob van der Putten
Hi there


Ian Thompson-Bell wrote:

> Cutting from what?


I have a large (> 28000) collection of glyphs and their descriptions.
So for a subscript I just do 'grep -ih subscript *' ;

₀ U2080 /xe2/x82/x80 SUBSCRIPT ZERO
₁ U2081 /xe2/x82/x81 SUBSCRIPT ONE
₂ U2082 /xe2/x82/x82 SUBSCRIPT TWO
₃ U2083 /xe2/x82/x83 SUBSCRIPT THREE
₄ U2084 /xe2/x82/x84 SUBSCRIPT FOUR
₅ U2085 /xe2/x82/x85 SUBSCRIPT FIVE
₆ U2086 /xe2/x82/x86 SUBSCRIPT SIX
₇ U2087 /xe2/x82/x87 SUBSCRIPT SEVEN
₈ U2088 /xe2/x82/x88 SUBSCRIPT EIGHT
₉ U2089 /xe2/x82/x89 SUBSCRIPT NINE
₊ U208A /xe2/x82/x8a SUBSCRIPT PLUS SIGN
₋ U208B /xe2/x82/x8b SUBSCRIPT MINUS
₌ U208C /xe2/x82/x8c SUBSCRIPT EQUALS SIGN
₍ U208D /xe2/x82/x8d SUBSCRIPT LEFT PARENTHESIS
₎ U208E /xe2/x82/x8e SUBSCRIPT RIGHT PARENTHESIS

The same for math, superscript, etc.


Regards,
Rob
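
For anyone without such a collection, a list in this style can probably be produced on the fly. This is only a sketch using Python's standard unicodedata module; the helper name `describe` is made up, and the bytes are printed in the usual `\x` escape notation.

```python
import unicodedata

def describe(codepoint):
    """Format one line: glyph, code point, UTF-8 bytes, Unicode name."""
    char = chr(codepoint)
    utf8 = "".join(f"\\x{byte:02x}" for byte in char.encode("utf-8"))
    return f"{char} U{codepoint:04X} {utf8} {unicodedata.name(char)}"

# U+2080 .. U+208E: the subscript digits and operators.
for codepoint in range(0x2080, 0x208F):
    print(describe(codepoint))
```

Changing the range gives the superscripts or the math operators in the same format.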
 #14  
03-18-08, 07:19 PM
Florian Diesch
Ian Thompson-Bell <ruffrecords> wrote:

> Mike Easter wrote:
>
> Ah, I just discovered the Ubuntu char map under Accessories. I am not
> sure how this works out as fewer keystrokes but no matter; all I need
> is a means to enter these characters and now I have it.
>
> Thanks for the link. It's a bit thin on details, especially the format
> of the Compose file.


Here's my ~/.XCompose:

--8<---------------cut here---------------start------------->8---
# using some ideas from https://trac.aellaweil.de/ideen/wiki/XCompose

include "%L"

<Multi_key> <1> <4> : "¼" U00BC # VULGAR FRACTION ONE QUARTER
<Multi_key> <1> <2> : "½" U00BD # VULGAR FRACTION ONE HALF
<Multi_key> <3> <4> : "¾" U00BE # VULGAR FRACTION THREE QUARTERS

<Multi_key> <1> <3> : "⅓" U2153 # VULGAR FRACTION ONE THIRD
<Multi_key> <2> <3> : "⅔" U2154 # VULGAR FRACTION TWO THIRDS
<Multi_key> <1> <5> : "⅕" U2155 # VULGAR FRACTION ONE FIFTH
<Multi_key> <2> <5> : "⅖" U2156 # VULGAR FRACTION TWO FIFTHS
<Multi_key> <3> <5> : "⅗" U2157 # VULGAR FRACTION THREE FIFTHS
<Multi_key> <4> <5> : "⅘" U2158 # VULGAR FRACTION FOUR FIFTHS
<Multi_key> <1> <6> : "⅙" U2159 # VULGAR FRACTION ONE SIXTH
<Multi_key> <5> <6> : "⅚" U215A # VULGAR FRACTION FIVE SIXTHS
<Multi_key> <1> <8> : "⅛" U215B # VULGAR FRACTION ONE EIGHTH
<Multi_key> <3> <8> : "⅜" U215C # VULGAR FRACTION THREE EIGHTHS
<Multi_key> <5> <8> : "⅝" U215D # VULGAR FRACTION FIVE EIGHTHS
<Multi_key> <7> <8> : "⅞" U215E # VULGAR FRACTION SEVEN EIGHTHS



<Multi_key> <t> <m> : "™" U2122 # TRADE MARK SIGN

<Multi_key> <e> <e> : "∈" U2208 # ELEMENT OF
<Multi_key> <n> <e> : "∉" U2209 # NOT AN ELEMENT OF

<Multi_key> <period> <minus> : "…" U2026 # HORIZONTAL ELLIPSIS
<Multi_key> <bar> <minus> : "†" U2020 # DAGGER
<Multi_key> <bar> <equal> : "‡" U2021 # DOUBLE DAGGER


<Multi_key> <minus> <less> : "←" U2190 # LEFTWARDS ARROW
<Multi_key> <minus> <asciicircum> : "↑" U2191 # UPWARDS ARROW
<Multi_key> <minus> <greater> : "→" U2192 # RIGHTWARDS ARROW
<Multi_key> <minus> <v> : "↓" U2193 # DOWNWARDS ARROW

<Multi_key> <equal> <less> : "⇐" U21D0 # LEFTWARDS DOUBLE ARROW
<Multi_key> <equal> <asciicircum> : "⇑" U21D1 # UPWARDS DOUBLE ARROW
<Multi_key> <equal> <greater> : "⇒" U21D2 # RIGHTWARDS DOUBLE ARROW
<Multi_key> <equal> <v> : "⇓" U21D3 # DOWNWARDS DOUBLE ARROW

<Multi_key> <0> <0> : "∞" U221E # INFINITY
<Multi_key> <m> <greater> : "≫" U226B # MUCH GREATER-THAN
<Multi_key> <m> <less> : "≪" U226A # MUCH LESS-THAN

<Multi_key> <underscore> <0> : "₀" U2080 # SUBSCRIPT ZERO
<Multi_key> <underscore> <1> : "₁" U2081 # SUBSCRIPT ONE
<Multi_key> <underscore> <2> : "₂" U2082 # SUBSCRIPT TWO
<Multi_key> <underscore> <3> : "₃" U2083 # SUBSCRIPT THREE
<Multi_key> <underscore> <4> : "₄" U2084 # SUBSCRIPT FOUR
<Multi_key> <underscore> <5> : "₅" U2085 # SUBSCRIPT FIVE
<Multi_key> <underscore> <6> : "₆" U2086 # SUBSCRIPT SIX
<Multi_key> <underscore> <7> : "₇" U2087 # SUBSCRIPT SEVEN
<Multi_key> <underscore> <8> : "₈" U2088 # SUBSCRIPT EIGHT
<Multi_key> <underscore> <9> : "₉" U2089 # SUBSCRIPT NINE

<Multi_key> <g> <a> : "α" U03B1 # GREEK SMALL LETTER ALPHA
<Multi_key> <g> <b> : "β" U03B2 # GREEK SMALL LETTER BETA
<Multi_key> <g> <c> : "γ" U03B3 # GREEK SMALL LETTER GAMMA
<Multi_key> <g> <d> : "δ" U03B4 # GREEK SMALL LETTER DELTA
<Multi_key> <g> <e> : "ε" U03B5 # GREEK SMALL LETTER EPSILON
<Multi_key> <g> <z> : "ζ" U03B6 # GREEK SMALL LETTER ZETA
<Multi_key> <g> <h> : "η" U03B7 # GREEK SMALL LETTER ETA
<Multi_key> <g> <g> : "θ" U03B8 # GREEK SMALL LETTER THETA
<Multi_key> <g> <i> : "ι" U03B9 # GREEK SMALL LETTER IOTA
<Multi_key> <g> <k> : "κ" U03BA # GREEK SMALL LETTER KAPPA
<Multi_key> <g> <l> : "λ" U03BB # GREEK SMALL LETTER LAMDA
<Multi_key> <g> <m> : "μ" U03BC # GREEK SMALL LETTER MU
<Multi_key> <g> <n> : "ν" U03BD # GREEK SMALL LETTER NU
<Multi_key> <g> <f> : "ξ" U03BE # GREEK SMALL LETTER XI
<Multi_key> <g> <o> : "ο" U03BF # GREEK SMALL LETTER OMICRON
<Multi_key> <g> <p> : "π" U03C0 # GREEK SMALL LETTER PI
<Multi_key> <g> <r> : "ρ" U03C1 # GREEK SMALL LETTER RHO
#<Multi_key> <g> <s> : "ς" U03C2 # GREEK SMALL LETTER FINAL SIGMA
<Multi_key> <g> <s> : "σ" U03C3 # GREEK SMALL LETTER SIGMA
<Multi_key> <g> <t> : "τ" U03C4 # GREEK SMALL LETTER TAU
<Multi_key> <g> <u> : "υ" U03C5 # GREEK SMALL LETTER UPSILON
<Multi_key> <g> <v> : "φ" U03C6 # GREEK SMALL LETTER PHI
<Multi_key> <g> <x> : "χ" U03C7 # GREEK SMALL LETTER CHI
<Multi_key> <g> <y> : "ψ" U03C8 # GREEK SMALL LETTER PSI
<Multi_key> <g> <w> : "ω" U03C9 # GREEK SMALL LETTER OMEGA


<Multi_key> <g> <A> : "Α" U0391 # GREEK CAPITAL LETTER ALPHA
<Multi_key> <g> <B> : "Β" U0392 # GREEK CAPITAL LETTER BETA
<Multi_key> <g> <C> : "Γ" U0393 # GREEK CAPITAL LETTER GAMMA
<Multi_key> <g> <D> : "Δ" U0394 # GREEK CAPITAL LETTER DELTA
<Multi_key> <g> <E> : "Ε" U0395 # GREEK CAPITAL LETTER EPSILON
<Multi_key> <g> <Z> : "Ζ" U0396 # GREEK CAPITAL LETTER ZETA
<Multi_key> <g> <H> : "Η" U0397 # GREEK CAPITAL LETTER ETA
<Multi_key> <g> <G> : "Θ" U0398 # GREEK CAPITAL LETTER THETA
<Multi_key> <g> <I> : "Ι" U0399 # GREEK CAPITAL LETTER IOTA
<Multi_key> <g> <K> : "Κ" U039A # GREEK CAPITAL LETTER KAPPA
<Multi_key> <g> <L> : "Λ" U039B # GREEK CAPITAL LETTER LAMDA
<Multi_key> <g> <M> : "Μ" U039C # GREEK CAPITAL LETTER MU
<Multi_key> <g> <N> : "Ν" U039D # GREEK CAPITAL LETTER NU
<Multi_key> <g> <F> : "Ξ" U039E # GREEK CAPITAL LETTER XI
<Multi_key> <g> <O> : "Ο" U039F # GREEK CAPITAL LETTER OMICRON
<Multi_key> <g> <P> : "Π" U03A0 # GREEK CAPITAL LETTER PI
<Multi_key> <g> <R> : "Ρ" U03A1 # GREEK CAPITAL LETTER RHO
<Multi_key> <g> <S> : "Σ" U03A3 # GREEK CAPITAL LETTER SIGMA
<Multi_key> <g> <T> : "Τ" U03A4 # GREEK CAPITAL LETTER TAU
<Multi_key> <g> <U> : "Υ" U03A5 # GREEK CAPITAL LETTER UPSILON
<Multi_key> <g> <V> : "Φ" U03A6 # GREEK CAPITAL LETTER PHI
<Multi_key> <g> <X> : "Χ" U03A7 # GREEK CAPITAL LETTER CHI
<Multi_key> <g> <Y> : "Ψ" U03A8 # GREEK CAPITAL LETTER PSI
<Multi_key> <g> <W> : "Ω" U03A9 # GREEK CAPITAL LETTER OMEGA

--8<---------------cut here---------------end--------------->8---
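
A compose file of this size is easy to get slightly wrong, for example with a code point that does not match the quoted character. Below is a rough Python sketch of a checker. The function name `check_compose` and the regular expression are my own assumptions and cover only the simple one-character `"x" Uxxxx` form used here, not the full Compose syntax.

```python
import re

# Matches lines like:  <Multi_key> <g> <b> : "β" U03B2 # comment
LINE = re.compile(r'^\s*(?:<\w+>\s*)+:\s*"(.)"\s+U([0-9A-Fa-f]{4,6})\b')

def check_compose(text):
    """Return (line_number, char, declared_codepoint) for every mismatch."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = LINE.match(line)
        if match is None:
            continue  # comments, includes, blank lines, other forms
        char, declared = match.group(1), int(match.group(2), 16)
        if ord(char) != declared:
            problems.append((number, char, declared))
    return problems

# Second line is deliberately wrong: capital beta is U+0392, not U+03B2.
sample = (
    '<Multi_key> <g> <b> : "\u03b2" U03B2 # GREEK SMALL LETTER BETA\n'
    '<Multi_key> <g> <B> : "\u0392" U03B2 # GREEK CAPITAL LETTER BETA\n'
)
for number, char, declared in check_compose(sample):
    print(f"line {number}: {char!r} is U+{ord(char):04X}, file says U+{declared:04X}")
```

To check a real file, pass it the contents of ~/.XCompose read as UTF-8.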


> I am surprised no one has created an emulation of
> the Windows method.


Gnome has a "Shift with numpad keys works as in MS Windows" option in
its keyboard settings. But IMHO using a compose key is much more
user-friendly than having to learn all these character codes.


Florian
 #15  
03-18-08, 10:10 PM
Wes Groleau
Ian Thompson-Bell wrote:
> Wes Groleau wrote:
>> Surely most Linux distros can handle UTF-8?


I do most of my posting from a Mac, but I would
be surprised to hear there's not something similar
on Kubuntu, so I asked.

On the Mac:

>> ???


Select the Chinese input method, type the pinyin for a syllable,
hit space, and click the character from the menu that pops up.

>>?(n? ho)


Select the pinyin keyboard layout (which I wrote myself),
type ni1 ha4o

>> ??????


Select Japanese and begin typing romaji. After each syllable,
it changes to hiragana. When the software detects a full word,
it converts it to kanji if appropriate.

>> Союз Советских Социалистических Республик (СССР)


I don't know much Russian, so I googled for CCCP, set lang
to Russian, and pasted this from one of the pages.

>> ……Written with “Thunderbird”


Selected "Unicode hex input", typed 2026 twice with the alt
key held, typed "Written with ", alt-201c for the open quote,
"Thunderbird" and alt-201d for the end quote.

> Yes, but HOW did you do that?


See above. I would like to know the best way to do it on Kubuntu.
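
For reference, the hex codes mentioned for the last line can be checked with a few lines of Python. This is only a sketch that rebuilds the string from the code points named above (2026, 201c, 201d).

```python
import unicodedata

# Code points typed via "Unicode hex input": 2026 twice, then 201c and 201d.
line = "\u2026\u2026Written with \u201cThunderbird\u201d"
print(line)

for char in ("\u2026", "\u201c", "\u201d"):
    print(f"U+{ord(char):04X} {unicodedata.name(char)}")
```

The names come out as HORIZONTAL ELLIPSIS, LEFT DOUBLE QUOTATION MARK and RIGHT DOUBLE QUOTATION MARK.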
