Bash script to truncate subject line of incoming email


$begingroup$


I'm going to put this script into production in a mail server /etc/aliases file: we have a system that receives email but the subject line must be limited to a certain size. The proposed usage in the aliases file:



alias_name: "| email_filter.sh -s 200 actual.recipient@example.com"


It will be deployed on a GNU/Linux system. I'd appreciate any feedback.



#!/bin/bash
shopt -s extglob

usage="$(basename $BASH_SOURCE) [-h] [-s n] recipient"
where="where: -s n == truncate the subject to n characters"
subject_length=''

while getopts :hs: opt; do
    case $opt in
        h) echo "$usage"; echo "$where"; exit ;;
        s) subject_length=$OPTARG ;;
        *) echo "Error: $usage" >&2; exit 1 ;;
    esac
done
shift $((OPTIND - 1))

# validation
if [[ "$#" -eq 1 ]]; then
    recipient=$1
else
    echo "Error: $usage" >&2
    exit 1
fi
if [[ -n $subject_length ]] && [[ $subject_length != +([0-9]) ]]; then
    echo "Error: subject length must be a whole number"
    exit 1
fi

sed_filters=()
if [[ -n $subject_length ]]; then
    sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\1/" )
fi
# other filters can go here

if [[ ${#sed_filters[@]} > 0 ]]; then
    cmd=( sed -E "${sed_filters[@]}" )
else
    # no command line filters given
    cmd=( cat )
fi

# now, filter the incoming email (on stdin) and pass to sendmail
"${cmd[@]}" | /usr/sbin/sendmail -oi "$recipient"









$endgroup$

















      validation bash linux email sed














      asked 4 hours ago by glenn jackman, edited 1 hour ago by 200_success






















          2 Answers


















          $begingroup$

          Generally good code - plus points for good use of stdout/stderr and exit status.



          Shellcheck reported some issues:



          shellcheck -f gcc  214327.sh
          214327.sh:4:19: warning: Expanding an array without an index only gives the first element. [SC2128]
          214327.sh:4:19: note: Double quote to prevent globbing and word splitting. [SC2086]
          214327.sh:31:61: note: Backslash is literal in "\1". Prefer explicit escaping: "\\1". [SC1117]
          214327.sh:35:26: error: > is for string comparisons. Use -gt instead. [SC2071]


          Taking the first two, I'd simply use $0 to reproduce the program name as it was invoked, rather than messing about with basename to modify it. The other two appear to be mere typos in the code, and the fixes are obvious.
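
For illustration, the three fixes might look like the sketch below. The helper name `build_subject_filter` is hypothetical and exists only so the expression can be tried without sendmail; everything else follows the original script.

```shell
#!/bin/bash
# SC2128 / SC2086: report the name the script was invoked with.
usage="$0 [-h] [-s n] recipient"

# SC1117: write the backslash explicitly so that sed receives \1.
# build_subject_filter is a hypothetical helper, not part of the original.
build_subject_filter() {
    local subject_length=$1
    printf '%s\n' "s/^(Subject: .{1,$subject_length}).*/\\1/"
}

sed_filters=( -e "$(build_subject_filter 10)" )

# SC2071: -gt compares numbers; inside [[ ]], > compares strings,
# so [[ 10 > 9 ]] would be false.
if [[ ${#sed_filters[@]} -gt 0 ]]; then
    cmd=( sed -E "${sed_filters[@]}" )
else
    cmd=( cat )
fi

printf 'Subject: %s\n' 'abcdefghijklmnopqrstuvwxyz' | "${cmd[@]}"
```

With a limit of 10, this should print `Subject: abcdefghij`.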



          We might want to perform some sanity checks on $recipient; in any case, it's wise to indicate that it's an argument and not an option when invoking sendmail, by using -- as a separator.
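
A minimal sketch of such a sanity check follows. The function name `valid_recipient` and the particular rules (non-empty, no leading dash, no whitespace, exactly one `@`) are assumptions for illustration; a site that delivers to bare local user names would need looser rules.

```shell
#!/bin/bash
# valid_recipient is a hypothetical helper; the rules are deliberately
# conservative and are not a full RFC 5322 address parser.
valid_recipient() {
    local r=$1
    [[ -n $r ]] || return 1                  # must not be empty
    [[ $r != -* ]] || return 1               # must not look like an option
    [[ $r != *[[:space:]]* ]] || return 1    # no embedded whitespace
    [[ $r == ?*@?* && $r != *@*@* ]]         # exactly one @, text on both sides
}

for candidate in 'actual.recipient@example.com' '-t' 'a b@example.com'; do
    if valid_recipient "$candidate"; then
        echo "accepted: $candidate"
        # the real script would now run:
        #   ... | /usr/sbin/sendmail -oi -- "$candidate"
    else
        echo "rejected: $candidate" >&2
    fi
done
```

Even with a check like this, keeping `--` in the sendmail invocation costs nothing and guards against anything the check misses.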



          The repeated tests for [[ -n $subject_length ]] could be combined into a single block:



          sed_filters=()

          if [[ -n $subject_length ]]
          then
              if [[ $subject_length != +([0-9]) ]]
              then
                  echo "Error: subject length must be a whole number"
                  exit 1
              fi

              sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\\1/" )
          fi

          # other filters can go here


          Instead of choosing between sed and cat, we could simplify by unconditionally using sed, even if we do no filtering, by priming the filters list with an empty command:



          sed_filters=(-e '')

          # conditionally add to sed_filters

          # now, filter the incoming email (on stdin) and pass to sendmail
          sed -E "${sed_filters[@]}" | /usr/sbin/sendmail -oi -- "$recipient"


          sed with an empty program acts as cat.
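
This is easy to check by hand. The sketch below assumes GNU sed, as on the target system; the sample message and the limit of 3 are made up for the demonstration.

```shell
#!/bin/bash
# Start with an empty sed program; with no other filters this copies stdin.
sed_filters=( -e '' )

message=$'Subject: hello\n\nbody line'
copied=$(printf '%s\n' "$message" | sed -E "${sed_filters[@]}")

if [[ $copied == "$message" ]]; then
    echo "empty program: message passed through unchanged"
fi

# Adding a real filter later needs no special casing.
sed_filters+=( -e 's/^(Subject: .{1,3}).*/\1/' )
truncated=$(printf '%s\n' "$message" | sed -E "${sed_filters[@]}")
printf '%s\n' "$truncated"
```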



          The sed line may match body text as well as headers; we probably want to replace only the latter. We can do that by adding an address prefix:



          1,/^$/s/^(Subject: .{1,$subject_length}).*/\1/i


          (Note that RFC-822 headers are specified case-insensitively, so let's take that into account, using /i).
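
To see the address range and the case-insensitive flag at work, here is a small sketch. The wrapper name `truncate_subject` and the sample message are hypothetical, and the `i` flag on `s` is a GNU sed extension.

```shell
#!/bin/bash
# truncate_subject is a hypothetical wrapper around the suggested expression.
# 1,/^$/ limits the substitution to the header block, which ends at the
# first empty line; the trailing i makes the match case-insensitive.
truncate_subject() {
    local n=$1
    sed -E "1,/^\$/s/^(Subject: .{1,$n}).*/\\1/i"
}

message=$'From: a@example.com\nSUBJECT: abcdefghij\n\nSubject: body text stays intact'
printf '%s\n' "$message" | truncate_subject 5
```

The header line should come out as `SUBJECT: abcde`, while the body line that happens to start with `Subject:` is left alone.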






          $endgroup$





















            $begingroup$



            Be sure to read the relevant RFCs that govern e-mail headers! Specifically:





            • RFC 2822, Section 1.2.2: Header names are case-insensitive.


            • RFC 2822, Section 2.2.3: Header fields may be line-folded:




              2.2.3. Long Header Fields



              Each header field is logically a single line of characters
              comprising the field name, the colon, and the field body. For
              convenience however, and to deal with the 998/78 character
              limitations per line, the field body portion of a header field can
              be split into a multiple line representation; this is called
              "folding". The general rule is that wherever this standard allows
              for folding white space (not simply WSP characters), a CRLF may be
              inserted before any WSP. For example, the header field:



              Subject: This is a test


              can be represented as:



              Subject: This
               is a test



              Since your sed operates on the raw representation of the header, you will miss subjects that are logically longer than subject_length characters but start with a physically short line.



              What is your rationale for developing this filter? Is the application that processes the incoming messages unable to handle long subject texts, or is it unable to handle long physical lines? If it's the latter, maybe all you need is a filter that performs line folding, rather than truncation.




            • RFC 2047: Encoding mechanisms for non-ASCII headers. A logical subject line



              Subject: this is some text


              … could also be represented physically as



              Subject: =?iso-8859-1?q?this=20is=20some=20text?=


              … or by many other representations. Is your limit based on the number of bytes in the raw representation, the number of bytes in the UTF-8 representation, the number of Unicode characters, or something else? You didn't specify clearly. If you are truncating the raw representation, you might truncate a MIME-encoded header at a point that makes it syntactically invalid.
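
One way to deal with the folding issue described above is to unfold the header block before any truncation runs. The following is only a sketch: the function name `unfold_headers` is hypothetical, and it assumes the message arrives on stdin with LF line endings (a CRLF-terminated message would need the carriage returns handled as well).

```shell
#!/bin/bash
# unfold_headers (hypothetical name) joins continuation lines in the header
# block onto the header they belong to; the body is copied through untouched.
unfold_headers() {
    awk '
        in_body          { print; next }
        /^$/             { if (have) print held
                           have = 0; in_body = 1
                           print; next }
        /^[ \t]/ && have { held = held $0; next }   # continuation line
                         { if (have) print held
                           held = $0; have = 1 }    # start of a new header
        END              { if (have) print held }
    '
}

message=$'Subject: This\n is a test\nTo: x@example.com\n\n body line\n indented too'
printf '%s\n' "$message" | unfold_headers
```

The folded subject should come out as the single line `Subject: This is a test`, so a later truncation step sees the logical header rather than just its first physical line. The indented body lines are not joined.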








            $endgroup$













            • $begingroup$
              OK, I can use formail -czx subject to extract the folded subject into a single line, but do you know what tools are available to help with encoded subjects?
              $endgroup$
              – glenn jackman
              59 mins ago
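
Short of a full MIME decoder, one conservative option is to detect RFC 2047 encoded words and leave such subjects untruncated, or hand them to a proper mail library. The sketch below only detects; it does not decode. The function name `has_encoded_word` and the regular expression are assumptions based on the `=?charset?encoding?text?=` form shown in the answer.

```shell
#!/bin/bash
# has_encoded_word (hypothetical name) succeeds if its argument contains
# something shaped like an RFC 2047 encoded word: =?charset?B-or-Q?text?=
has_encoded_word() {
    local re='=\?[^?[:space:]]+\?[bBqQ]\?[^?[:space:]]*\?='
    [[ $1 =~ $re ]]
}

for subject in 'Subject: this is some text' \
               'Subject: =?iso-8859-1?q?this=20is=20some=20text?='; do
    if has_encoded_word "$subject"; then
        echo "encoded, do not cut blindly: $subject"
    else
        echo "plain, safe to truncate: $subject"
    fi
done
```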




















                First answer: answered 3 hours ago by Toby Speight, edited 2 hours ago









































                    4












                    4








                    4





                    $begingroup$



                    Be sure to read the relevant RFCs that govern e-mail headers! Specifically:





                    • RFC 2822, Section 1.2.2: Header names are case-insensitive.


                    • RFC 2822, Section 2.2.3: Header fields may be line-folded:




                      2.2.3. Long Header Fields



                      Each header field is logically a single line of characters
                      comprising the field name, the colon, and the field body. For
                      convenience however, and to deal with the 998/78 character
                      limitations per line, the field body portion of a header field can
                      be split into a multiple line representation; this is called
                      "folding". The general rule is that wherever this standard allows
                      for folding white space (not simply WSP characters), a CRLF may be
                      inserted before any WSP. For example, the header field:



                      Subject: This is a test


                      can be represented as:



                      Subject: This
                      is a test



                      Since your sed operates on the raw representation of the header, you will miss headers that are logically longer than subject_length characters long, but start with a physically short line.



                      What is your rationale for developing this filter? Is the application that processes the incoming messages unable to handle long subject texts, or is it unable to handle long physical lines? If it's the latter, maybe all you need is a filter that performs line folding, rather than truncation.




                    • RFC 2047: Encoding mechanisms for non-ASCII headers. A logical subject line



                      Subject: this is some text


                      … could also be represented physically as



                      Subject: =?iso-8859-1?q?this=20is=20some=20text?=


                      … or by many other representations. Is your limit based on the number of bytes in the raw representation, the number of bytes in the UTF-8 representation, the number of Unicode characters, or something else? You didn't specify clearly. If you are truncating the raw representation, you might truncate a MIME-encoded header at a point that makes it syntactically invalid.








                    share|improve this answer











                    $endgroup$





                    Be sure to read the relevant RFCs that govern e-mail headers! Specifically:





                    • RFC 2822, Section 1.2.2: Header names are case-insensitive.


                    • RFC 2822, Section 2.2.3: Header fields may be line-folded:




                      2.2.3. Long Header Fields



                      Each header field is logically a single line of characters
                      comprising the field name, the colon, and the field body. For
                      convenience however, and to deal with the 998/78 character
                      limitations per line, the field body portion of a header field can
                      be split into a multiple line representation; this is called
                      "folding". The general rule is that wherever this standard allows
                      for folding white space (not simply WSP characters), a CRLF may be
                      inserted before any WSP. For example, the header field:



                      Subject: This is a test


                      can be represented as:



                      Subject: This
                      is a test



                      Since your sed operates on the raw representation of the header, you will miss headers that are logically longer than subject_length characters long, but start with a physically short line.



                      What is your rationale for developing this filter? Is the application that processes the incoming messages unable to handle long subject texts, or is it unable to handle long physical lines? If it's the latter, maybe all you need is a filter that performs line folding, rather than truncation.




                    • RFC 2047: Encoding mechanisms for non-ASCII headers. A logical subject line



                      Subject: this is some text


                      … could also be represented physically as



                      Subject: =?iso-8859-1?q?this=20is=20some=20text?=


                      … or by many other representations. Is your limit based on the number of bytes in the raw representation, the number of bytes in the UTF-8 representation, the number of Unicode characters, or something else? You didn't specify clearly. If you are truncating the raw representation, you might truncate a MIME-encoded header at a point that makes it syntactically invalid.









                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    answered 1 hour ago (edited 1 hour ago) by 200_success































