Thanks for the pointer in the right direction. I hadn't even considered that 'encode' wasn't getting a pristine copy of the input. It turns out that command substitution, $(...), silently strips trailing newlines, and an unquoted expansion loses even more to word splitting. After I discovered this I made a simple demonstration:
pitcher
#!/bin/bash
printf "This is output that needs to be
preserved exactly
"
catcher
#!/bin/bash
CATCH1=$(./pitcher)
echo CATCH1
echo -n $CATCH1
echo
echo
echo "---------------------------"
echo
CATCH2=$(./pitcher)
echo CATCH2
echo -n "$CATCH2"
echo
echo
echo "---------------------------"
echo
TMP=$(./pitcher; echo XXX) # Add our own 'end of input' sentinel.
CATCH3="${TMP%XXX}" # Del sentinel == exact copy of subprocess output.
echo CATCH3
echo -n "$CATCH3"
Output from executing catcher:
CATCH1
This is output that needs to be preserved exactly
---------------------------
CATCH2
This is output that needs to be
preserved exactly
---------------------------
CATCH3
This is output that needs to be
preserved exactly
Here's the updated script:
#!/bin/bash
# array of standard base64 characters
CHARS=($(echo -n "A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c \
d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 + /"))
IFS=' '
# standard character used for padding
# XXX needs implementation
PAD="="
# return input as a string of space-separated decimal numbers
function to_dec {
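# printf avoids echo treating input such as "-n" as an option;
# %u keeps bytes above 127 from being printed as negative numbers.
printf '%s' "$1" | hexdump -ve '/1 "%u "'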
}
function encode {
TMP=$(to_dec "$1"; echo XXX) # Quote $1 so spaces in the input survive.
typeset -a -r bytes=(${TMP%XXX}) # Read-only array of input as numbers.
typeset -i byte # For translating to a CHARS index.
typeset -i bytei # Index to current byte of input.
typeset -i chari # Index to char to translate to.
typeset -i left=0 # Bits from previous translation.
typeset -i a # tmp
typeset -i b # tmp
typeset encoded="" # Output buffer.
for ((bytei=0; bytei < ${#bytes[*]}; bytei++)); do
if [ $left -eq 0 ]; then
byte=${bytes[$bytei]}
chari=$(( $byte >> 2 ))
encoded=${encoded}${CHARS[$chari]}
left=2
elif [ $left -eq 6 ]; then
byte=${bytes[$(( $bytei - 1 ))]}
chari=$(( $byte & 63 ))
encoded=${encoded}${CHARS[$chari]}
byte=${bytes[$bytei]}
chari=$(( $byte >> 2 ))
encoded=${encoded}${CHARS[$chari]}
left=2
else
a=$(( (${bytes[$(( $bytei - 1 ))]}) & (2 ** $left - 1) ))
b=$(( (${bytes[$bytei]}) >> ($left + 2) ))
chari=$(( ($a << (6 - $left)) | $b ))
encoded=${encoded}${CHARS[$chari]}
left=$(( $left + 2 ))
fi
done
if [ $left -ne 0 ];then
byte=${bytes[$(( ${#bytes[*]} - 1 ))]}
chari=$(( ($byte & (2 ** $left - 1)) << (6 - $left) ))
encoded=${encoded}${CHARS[$chari]}
fi
echo -n "$encoded"
}
function decode {
echo "Not Implemented"
}
if [ $# -eq 0 ]; then
TMP=$(cat -; echo -n XXX)
F=${TMP%XXX}
echo $(encode "$F") # Quoted: unquoted $F is split on spaces and glob-expanded.
else
echo "Wrong args..."
fi
$ echo hello | ./base64.sh | /usr/bin/base64 -d
hello
btw, I'm just ignoring proper padding until I get the core algorithms working. Thanks again.
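For when I get to padding, here is a rough sketch of what I have in mind. Standard base64 output is always a multiple of 4 characters long, so it should be enough to append '=' until that holds. The function name pad_base64 is made up and it is not wired into the script above yet. The 'hello' test above decodes cleanly only because "hello" plus the newline is 6 bytes, a multiple of 3, so no padding is needed there.

```bash
#!/bin/bash
# pad_base64 STRING
# Hypothetical helper: appends '=' until the length is a multiple of 4.
pad_base64() {
    local s=$1
    while (( ${#s} % 4 != 0 )); do
        s+="="
    done
    printf '%s' "$s"
}

pad_base64 "aGk"; echo        # unpadded encoding of "hi"
pad_base64 "aGVsbG8K"; echo   # already a multiple of 4, left unchanged
```

In the script this would replace the final echo -n "$encoded" in encode with a call on "$encoded".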