Bash script to truncate subject line of incoming email
I'm going to put this script into production in a mail server's /etc/aliases file: we have a system that receives email, but the subject line must be limited to a certain size. The proposed usage in the aliases file:
    alias_name: "| email_filter.sh -s 200 actual.recipient@example.com"
It will be deployed on a GNU/Linux system. I'd appreciate any feedback.
    #!/bin/bash

    shopt -s extglob
    usage="$(basename $BASH_SOURCE) [-h] [-s n] recipient"
    where="where: -s n == truncate the subject to n characters"
    subject_length=''

    while getopts :hs: opt; do
        case $opt in
            h) echo "$usage"; echo "$where"; exit ;;
            s) subject_length=$OPTARG ;;
            *) echo "Error: $usage" >&2; exit 1 ;;
        esac
    done
    shift $((OPTIND - 1))

    # validation
    if [[ "$#" -eq 1 ]]; then
        recipient=$1
    else
        echo "Error: $usage" >&2
        exit 1
    fi
    if [[ -n $subject_length ]] && [[ $subject_length != +([0-9]) ]]; then
        echo "Error: subject length must be a whole number"
        exit 1
    fi

    sed_filters=()
    if [[ -n $subject_length ]]; then
        sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\1/" )
    fi

    # other filters can go here
    if [[ ${#sed_filters[@]} > 0 ]]; then
        cmd=( sed -E "${sed_filters[@]}" )
    else
        # no command line filters given
        cmd=( cat )
    fi

    # now, filter the incoming email (on stdin) and pass to sendmail
    "${cmd[@]}" | /usr/sbin/sendmail -oi "$recipient"
Tags: validation, bash, linux, email, sed
Asked 4 hours ago by glenn jackman; edited 1 hour ago by 200_success.
2 Answers
Generally good code - plus points for good use of stdout/stderr and exit status.
Shellcheck reported some issues:
    shellcheck -f gcc 214327.sh
    214327.sh:4:19: warning: Expanding an array without an index only gives the first element. [SC2128]
    214327.sh:4:19: note: Double quote to prevent globbing and word splitting. [SC2086]
    214327.sh:31:61: note: Backslash is literal in "\1". Prefer explicit escaping: "\\1". [SC1117]
    214327.sh:35:26: error: > is for string comparisons. Use -gt instead. [SC2071]
Taking the first two, I'd simply use $0 to reproduce the program name as it was invoked, rather than messing about with basename to modify it. The other two appear to be mere typos in the code, and the fixes are obvious.
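On the SC2071 point: inside [[ ]], > compares its operands as strings, so the original test only gives the right answer because any non-zero count happens to sort after "0". A minimal sketch of the difference (the function names are made up for illustration):

```bash
#!/bin/bash
# Hypothetical helpers, for illustration only.

# String comparison: inside [[ ]], > compares lexicographically.
is_greater_lexical() { [[ $1 > $2 ]]; }

# Numeric comparison: -gt treats both operands as integers.
is_greater_numeric() { [[ $1 -gt $2 ]]; }

if is_greater_lexical 10 9; then
    echo "lexical: 10 sorts after 9"
else
    echo "lexical: 10 sorts before 9"
fi

if is_greater_numeric 10 9; then
    echo "numeric: 10 is greater than 9"
else
    echo "numeric: 10 is not greater than 9"
fi
```

Using -gt (or an arithmetic test such as (( count > 0 ))) states the intent and keeps working if the threshold ever changes from 0.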
We might want to perform some sanity checks on $recipient; in any case, it's wise to indicate that it's an argument and not an option when invoking sendmail, by using -- as a separator.
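What a sanity check on the recipient might look like is a judgment call. The sketch below is deliberately loose: it only rejects values that are empty, start with a dash, or contain whitespace. The function name is hypothetical, and this is not an address parser.

```bash
#!/bin/bash
# Hypothetical sanity check for the recipient argument.
# It is intentionally permissive: it does not validate address syntax.
is_plausible_recipient() {
    case $1 in
        '')            return 1 ;;  # empty argument
        -*)            return 1 ;;  # looks like an option
        *[[:space:]]*) return 1 ;;  # whitespace is not expected here
        *)             return 0 ;;
    esac
}

# Possible use in the script (sketch):
#   is_plausible_recipient "$recipient" || { echo "Error: bad recipient" >&2; exit 1; }
#   ... | /usr/sbin/sendmail -oi -- "$recipient"
```

Even with a check like this in place, keeping the -- separator is cheap insurance against a value being read as an option.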
The repeated tests for [[ -n $subject_length ]] could be combined into a single block:
    sed_filters=()
    if [[ -n $subject_length ]]
    then
        if [[ $subject_length != +([0-9]) ]]
        then
            echo "Error: subject length must be a whole number"
            exit 1
        fi
        sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\\1/" )
    fi
    # other filters can go here
Instead of choosing between sed and cat, we could simplify by unconditionally using sed, even if we do no filtering, by priming the filters list with an empty command:
    sed_filters=(-e '')
    # conditionally add to sed_filters
    # now, filter the incoming email (on stdin) and pass to sendmail
    sed -E "${sed_filters[@]}" | /usr/sbin/sendmail -oi -- "$recipient"
sed with an empty program acts as cat.
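A quick way to convince yourself of this is to wrap the sed stage in a function and feed it a message with and without extra expressions. The wrapper name filter_message is an assumption for this sketch; it is not part of the reviewed script.

```bash
#!/bin/bash
# filter_message is a hypothetical wrapper around the sed stage.
# Each argument is treated as an additional sed expression.
filter_message() {
    # The empty program keeps sed valid even when no filters are added.
    local sed_filters=(-e '')
    local expr
    for expr in "$@"; do
        sed_filters+=(-e "$expr")
    done
    sed -E "${sed_filters[@]}"
}

# With no expressions, the input is copied through unchanged.
printf 'Subject: hello\n\nbody\n' | filter_message
```

This removes the cat branch entirely, at the cost of always running sed.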
The sed line may match body text as well as headers; we probably want to replace only the latter. We can do that by adding an address prefix:
    1,/^$/s/^(Subject: .{1,$subject_length}).*/\1/i
(Note that RFC-822 headers are specified case-insensitively, so let's take that into account, using /i).
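To illustrate the address prefix, here is a small sketch that truncates a lower-case subject header while leaving a "Subject:" line in the body alone. It assumes GNU sed (the case-insensitive flag is a GNU extension) and LF line endings; truncate_subject is a made-up name.

```bash
#!/bin/bash
# truncate_subject is a hypothetical helper: $1 is the maximum number of
# characters to keep after "Subject: ". Assumes GNU sed and LF line endings.
truncate_subject() {
    local n=$1
    # 1,/^$/ restricts the substitution to the lines up to and including
    # the first empty line, i.e. the header block.
    sed -E -e "1,/^\$/s/^(Subject: .{1,$n}).*/\\1/I"
}

printf '%s\n' \
    'From: a@example.com' \
    'subject: 0123456789' \
    '' \
    'Subject: 0123456789 (quoted in the body)' |
    truncate_subject 4
```

If the message arrives with CRLF line endings, the separator line is not empty from sed's point of view, so the range would need adjusting.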
Answered 3 hours ago by Toby Speight; edited 2 hours ago.
Be sure to read the relevant RFCs that govern e-mail headers! Specifically:
RFC 2822, Section 1.2.2: Header names are case-insensitive.
RFC 2822, Section 2.2.3: Header fields may be line-folded:
    2.2.3. Long Header Fields
    Each header field is logically a single line of characters
    comprising the field name, the colon, and the field body. For
    convenience however, and to deal with the 998/78 character
    limitations per line, the field body portion of a header field can
    be split into a multiple line representation; this is called
    "folding". The general rule is that wherever this standard allows
    for folding white space (not simply WSP characters), a CRLF may be
    inserted before any WSP. For example, the header field:
        Subject: This is a test
    can be represented as:
        Subject: This
         is a test
Since your sed operates on the raw representation of the header, you will miss headers that are logically longer than subject_length characters but start with a physically short line.
What is your rationale for developing this filter? Is the application that processes the incoming messages unable to handle long subject texts, or is it unable to handle long physical lines? If it's the latter, maybe all you need is a filter that performs line folding, rather than truncation.
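If truncation of the logical subject is what's needed, one option is to unfold the header block before the sed stage. The following is a rough sketch using awk from the shell; unfold_headers is a hypothetical name, and the code assumes LF line endings and does nothing about encoded subjects.

```bash
#!/bin/bash
# unfold_headers is a hypothetical helper: it joins folded header lines
# (continuation lines start with a space or tab) and copies the body as-is.
unfold_headers() {
    awk '
        in_body          { print; next }             # body: copy verbatim
        /^$/             { if (have) print held      # end of headers
                           have = 0; in_body = 1
                           print; next }
        /^[ \t]/ && have { held = held $0; next }    # continuation line
                         { if (have) print held      # new header field
                           held = $0; have = 1 }
        END              { if (have) print held }    # headers only, no body
    '
}

printf 'Subject: This\n is a test\nTo: x\n\n body\n' | unfold_headers
```

Unfolding here just removes the line break in front of the continuation whitespace, which matches the rule quoted from the RFC. Note that the result may contain long physical lines, so whether this is acceptable depends on what the downstream application can handle.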
RFC 2047: Encoding mechanisms for non-ASCII headers. A logical subject line
Subject: this is some text
… could also be represented physically as
Subject: =?iso-8859-1?q?this=20is=20some=20text?=
… or by many other representations. Is your limit based on the number of bytes in the raw representation, the number of bytes in the UTF-8 representation, the number of Unicode characters, or something else? You didn't specify clearly. If you are truncating the raw representation, you might truncate a MIME-encoded header at a point that makes it syntactically invalid.
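Decoding such headers properly is beyond what sed can reasonably do, but a filter can at least notice when a subject contains something shaped like an encoded-word and decide to leave it alone or handle it separately. A minimal detection sketch (has_encoded_word is a made-up name, and the pattern is an approximation of the RFC 2047 syntax, not a validator):

```bash
#!/bin/bash
# has_encoded_word is a hypothetical check: it succeeds when its argument
# contains something shaped like =?charset?encoding?text?= where the
# encoding is B or Q in either case.
has_encoded_word() {
    printf '%s\n' "$1" |
        grep -Eq '=\?[^?[:space:]]+\?[BbQq]\?[^?[:space:]]*\?='
}

if has_encoded_word 'Subject: =?iso-8859-1?q?this=20is=20some=20text?='; then
    echo "encoded subject: cutting the raw text could break the encoding"
fi
```

What to do when the check succeeds (skip truncation, log, or hand the header to a MIME-aware tool) depends on how the limit is defined, which is the open question above.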
Answered 1 hour ago by 200_success; edited 1 hour ago.
Comment: OK, I can use formail -czx subject to extract the folded subject into a single line, but do you know what tools are available to help with encoded subjects? – glenn jackman, 59 mins ago