You're asking for a sed solution, but an awk solution is simpler and performs better in this case, because you can easily split the line into 2 fields by = and then selectively apply gsub() to only the 1st field in order to replace the characters of interest:
$ awk -F= '{ gsub("[./-]", "_", $1); print $1 FS $2 }' <<< 'Int/domain-home.dir=/etc/int'
Int_domain_home_dir=/etc/int
-F= tells awk to split the input into fields by =, which with the input at hand results in $1 (1st field) containing the first half of the line, before the =, and $2 (2nd field) the 2nd half, after the =; using the -F option sets variable FS, the input field separator.
gsub("[./-]", "_", $1) globally replaces all characters in set [./-] with _ in $1 - i.e., all occurrences of either ., / or - in the 1st field are replaced with a _ each.
print $1 FS $2 prints the result: the modified 1st field ($1), followed by FS (which is =), followed by the (unmodified) 2nd field ($2).
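One caveat with the command above: print $1 FS $2 outputs only the first two fields, so a value that itself contains = would be cut off after its first part. Below is a minimal sketch of a variant that keeps the rest of the line intact, assuming such values can occur in your input; sanitize_keys is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper (not part of the original answer): replaces ., / and -
# with _ in the key only, and passes everything from the first = on unchanged.
sanitize_keys() {
  awk -F= '{
    key = $1                               # work on a copy so $0 is not rebuilt
    gsub("[./-]", "_", key)                # same character set as above
    print key substr($0, length($1) + 1)   # append the untouched remainder of the line
  }'
}

# Usage: the value contains a second =, which is preserved.
printf '%s\n' 'Int/domain-home.dir=/etc/int=x' | sanitize_keys
# prints: Int_domain_home_dir=/etc/int=x
```

Copying $1 into a separate variable rather than modifying it in place matters here: assigning to a field makes awk rebuild $0, which would interfere with extracting the remainder via substr().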
Note that I've used ASCII char. - (HYPHEN-MINUS, codepoint 0x2d) in the awk script, even though your sample input contains the Unicode char. — (EM DASH, U+2014, UTF-8 encoding 0xe2 0x80 0x94).
If you really want to match that, simply substitute it in the command above, but note that the awk version on macOS won't handle that properly.
Another option is to use iconv with ASCII transliteration, which translates the em dash into a regular ASCII -:
iconv -f utf-8 -t ascii//translit <<< 'Int/domain—home.dir=/etc/int' |
awk -F= '{ gsub("[./-]", "_", $1); print $1 FS $2 }'
perl allows for an elegant solution too:
$ perl -F= -ane '$F[0] =~ tr|-/.|_|; print join("=", @F)' <<<'Int/domain-home.dir=/etc/int'
Int_domain_home_dir=/etc/int
-F=, just like with Awk, tells Perl to use = as the separator when splitting lines into fields.
-ane activates field splitting (a), processes the input line by line without printing each line automatically (n), and e tells Perl that the next argument is an expression (command string) to execute.
The fields that each line is split into are stored in array @F, where $F[0] refers to the 1st field.
$F[0] =~ tr|-/.|_| translates (replaces) all occurrences of -, /, and . to _.
print join("=", @F)
rebuilds the input line from the fields - with the 1st field now modified - and prints the result.
Depending on the Awk implementation used, this may actually be faster (see below).
That sed isn't the best tool for this job is also reflected in the relative performance of the solutions:
Sample timings from my macOS 10.12 machine (GNU sed 4.2.2, Mawk awk 1.3.4, perl v5.18.2, using input file file, which contains 1 million copies of the sample input line) - take them with a grain of salt, but the ratios of the numbers are of interest; fastest solutions first:
# This answer's awk solution:
# Note: Mawk is much faster here than GNU Awk and BSD Awk.
$ time awk -F= '{ gsub("[./-]", "_", $1); print $1 FS $2 }' file >/dev/null
real 0m0.657s
# This answer's perl solution:
# Note: On macOS, this outperforms the Awk solution when using either
# GNU Awk or BSD Awk.
$ time perl -F= -ane '$F[0] =~ tr|-/.|_|; print join("=", @F)' file >/dev/null
real 0m1.656s
# Sundeep's perl solution with tr///
$ time perl -pe 's#^[^=]+#$&=~tr|/.-|_|r#e' file >/dev/null
real 0m2.370s
# Sundeep's perl solution with s///
$ time perl -pe 's#^[^=]+#$&=~s|[/.-]|_|gr#e' file >/dev/null
real 0m3.540s
# Cyrus' solution.
$ time sed 'h;s/[^=]*//;x;s/=.*//;s/[/.-]/_/g;G;s/\n//' file >/dev/null
real 0m4.090s
# Kenavoz' solution.
# Note: The 3-byte UTF-8 em dash is NOT included in the char. set,
# for consistency of comparison with the other solutions.
# Interestingly, adding the em dash adds another 2 seconds or so.
$ time sed ':a;s/[-/.]\(.*=\)/_\1/;ta' file >/dev/null
real 0m9.036s
As you can see, the awk solution is fastest by far, with the line-internal-loop sed solution predictably performing worst, by a factor of about 14 (9.036s vs. 0.657s).
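To reproduce timings like the ones above, you need an input file with 1 million copies of the sample line. The following is one possible way to generate it, not necessarily how the original file was created; make_sample_file is a hypothetical helper name.

```shell
# Hypothetical helper: writes COUNT copies of the sample input line to FILE.
# Usage: make_sample_file FILE COUNT
make_sample_file() {
  awk -v n="$2" 'BEGIN {
    for (i = 0; i < n; i++) print "Int/domain-home.dir=/etc/int"
  }' > "$1"
}

# Create the 1-million-line input file named "file" used in the timings.
make_sample_file file 1000000
```

Generating the lines inside a single awk BEGIN block avoids a pipeline such as yes | head, whose early-exit behavior can be reported as a failure in shells running with pipefail enabled.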