Parsing a string of key-value pairs as a dictionary
















I always use nested list and dictionary comprehensions for unstructured data, and this is a common way I use them.

In [14]: data = """
41:n
43:n
44:n
46:n
47:n
49:n
50:n
51:n
52:n
53:n
54:n
55:cm
56:n
57:n
58:n"""
In [15]: {int(line.split(":")[0]): line.split(":")[1] for line in data.split("\n") if len(line.split(":")) == 2}
Out[15]:
{41: 'n',
 43: 'n',
 44: 'n',
 46: 'n',
 47: 'n',
 49: 'n',
 50: 'n',
 51: 'n',
 52: 'n',
 53: 'n',
 54: 'n',
 55: 'cm',
 56: 'n',
 57: 'n',
 58: 'n'}

Here I am calling line.split(":") three times. Is there a better way to do this?
























      python python-3.x parsing dictionary






asked 8 hours ago by Rahul Patel
edited 4 hours ago by 200_success






















          3 Answers





































There's nothing wrong with the solution you have come up with, but if you want an alternative, a regex might come in handy here:

In [10]: import re
In [11]: data = """
    ...: 41:n
    ...: 43:n
    ...: 44:n
    ...: 46:n
    ...: 47:n
    ...: 49:n
    ...: 50:n
    ...: 51:n
    ...: 52:n
    ...: 53:n
    ...: 54:n
    ...: 55:cm
    ...: 56:n
    ...: 57:n
    ...: 58:n"""

In [12]: dict(re.findall(r'(\d+):(.*)', data))
Out[12]:
{'41': 'n',
 '43': 'n',
 '44': 'n',
 '46': 'n',
 '47': 'n',
 '49': 'n',
 '50': 'n',
 '51': 'n',
 '52': 'n',
 '53': 'n',
 '54': 'n',
 '55': 'cm',
 '56': 'n',
 '57': 'n',
 '58': 'n'}

Explanation:

1st Capturing Group (\d+):

\d matches a digit (equal to [0-9])
+ Quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : literally (case sensitive)

2nd Capturing Group (.*):

. matches any character (except for line terminators)
* Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)

If there might be letters in the first matching group (though I doubt it, since you're casting it to an int), you might want to use:

dict(re.findall(r'(.*):(.*)', data))

I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.

You might ask, why would you want to use the more complicated and verbose syntax of regular expressions rather than the more intuitive and simple string methods? Sometimes, the advantage is that regular expressions offer far more flexibility.

Regarding @Rahul's comment about speed, I'd say it depends:

Although string manipulation will usually be somewhat faster, the actual performance depends heavily on a number of factors, including:

• How many times you parse the regex
• How cleverly you write your string code
• Whether the regex is precompiled

As the regex gets more complicated, it will take much more effort and complexity to write equivalent string manipulation code that performs well.

As far as I can tell, string operations will almost always beat regular expressions. But the more complex the task gets, the harder it is for string operations to keep up, not only in performance but also in maintainability.
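The regex version above returns string keys, while the question builds int keys. A minimal sketch of closing that gap, assuming every wanted line is digits, a colon, then the value; the names PAIR and parse_pairs and the shortened sample are made up for illustration:

```python
import re

# Shortened, hypothetical sample in the same shape as the question's data.
data = """
41:n
55:cm
58:n"""

# Compile once so the pattern is not parsed again on every call.
# re.MULTILINE makes ^ and $ match at the start and end of each line.
PAIR = re.compile(r"^(\d+):(.*)$", re.MULTILINE)


def parse_pairs(text):
    """Return {int key: str value} for every 'digits:value' line in text."""
    return {int(key): value for key, value in PAIR.findall(text)}


print(parse_pairs(data))   # {41: 'n', 55: 'cm', 58: 'n'}
```

Lines that do not start with digits followed by a colon are skipped silently, which may or may not be what you want for malformed input.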













• Yeah. I think regexes are slow too. – Rahul Patel, 5 hours ago
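Rather than guess at the speed difference, it can be measured. This is a rough timing sketch, not a benchmark: the generated input, the function names with_split and with_regex, and the run count are all assumptions, and the numbers will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical input: 10,000 lines shaped like the question's data.
data = "\n".join(f"{i}:n" for i in range(10_000))

# Precompiled, since re-parsing the pattern is one of the factors listed above.
PAIR = re.compile(r"(\d+):(.*)")


def with_split(text):
    """Plain string methods: split into lines, then split each line once."""
    result = {}
    for line in text.split():
        key, value = line.split(":")
        result[int(key)] = value
    return result


def with_regex(text):
    """Regex version: one findall over the whole text."""
    return {int(key): value for key, value in PAIR.findall(text)}


# Both approaches must agree before the timings mean anything.
assert with_split(data) == with_regex(data)

for func in (with_split, with_regex):
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.3f}s for 20 runs")
```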



















Note that it is much easier to read if you break the comprehension up over several lines instead of putting it all on one line.

You could use unpacking to remove some of the calls to line.split:

>>> dictionary = {
...     int(k): v
...     for line in data.split('\n')
...     if len(line.split(':')) == 2
...     for k, v in (line.split(':'),)
... }
>>> print(dictionary)
{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}

Or, if the keys can stay as str, you could use dict(), which turns each two-item result of line.split into a key-value pair for you:

>>> dictionary2 = dict(
...     line.split(':')
...     for line in data.split('\n')
...     if len(line.split(':')) == 2
... )
>>> print(dictionary2)
{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}
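On Python 3.8 or newer, an assignment expression is another way to call line.split only once per line inside a comprehension. This swaps in a different technique from the unpacking approach in this answer; the name parts and the shortened sample are made up for illustration.

```python
# Shortened, hypothetical sample in the same shape as the question's data.
data = """
41:n
55:cm
58:n"""

# The filter runs first and binds parts, so the key/value expression
# can reuse the already-split list instead of splitting again.
parsed = {
    int(parts[0]): parts[1]
    for line in data.split("\n")
    if len(parts := line.split(":")) == 2
}

print(parsed)   # {41: 'n', 55: 'cm', 58: 'n'}
```

Note that parts is bound in the enclosing scope, so it remains defined after the comprehension finishes.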













• This is great. I was trying this but could not figure out the tuple thing. Thanks. – Rahul Patel, 1 hour ago
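For anyone else puzzled by the tuple: wrapping the split result in a one-item tuple gives the inner for clause exactly one thing to iterate over, and that one thing is then unpacked into two names. A small illustration on a single hypothetical line:

```python
line = "55:cm"

parts = line.split(":")   # ['55', 'cm']
wrapped = (parts,)        # a tuple holding exactly one item: the list itself

# Iterating the one-item tuple runs the loop body once,
# and "k, v" unpacks that single item into two names.
for k, v in wrapped:
    print(k, v)           # 55 cm

# The same idea inside a comprehension. Without the trailing comma the
# loop would walk over '55' and 'cm' and unpack each string's characters.
pairs = [(k, v) for k, v in (line.split(":"),)]
print(pairs)              # [('55', 'cm')]
```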

























You have too much logic in the dict comprehension:

{int(line.split(":")[0]):line.split(":")[1] for line in data.split("\n") if len(line.split(":"))==2}



First of all, let's expand it to a normal for-loop:

>>> result = {}
>>> for line in data.split("\n"):
...     if len(line.split(":")) == 2:
...         result[int(line.split(":")[0])] = line.split(":")[1]
...
>>> result

I can see that you use the check len(line.split(":")) == 2 to skip the empty string at the start of data.split("\n"):

>>> data.split("\n")
['',
 '41:n',
 '43:n',
 ...
 '58:n']


But the docs for str.split note that when sep is not specified, runs of whitespace are treated as a single separator and the result contains no empty strings at the start or end:

>>> data.split()
['41:n',
 '43:n',
 ...
 '58:n']
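A small self-contained check of that difference, using a shortened, hypothetical sample rather than the full data from the question. One caveat, stated here as an assumption about your input: split() with no arguments splits on any whitespace, so it only fits when neither keys nor values contain spaces.

```python
# Shortened, hypothetical sample in the same shape as the question's data.
data = """
41:n
55:cm
58:n"""

# An explicit separator keeps the empty string produced by the leading newline.
with_sep = data.split("\n")
print(with_sep)      # ['', '41:n', '55:cm', '58:n']

# No separator: runs of whitespace count as one delimiter,
# and there are no empty strings at the start or end.
without_sep = data.split()
print(without_sep)   # ['41:n', '55:cm', '58:n']

# Caveat: a value containing a space is cut apart by the no-argument form.
spaced = "60:not set".split()
print(spaced)        # ['60:not', 'set']
```

If values may contain spaces, str.splitlines() plus a blank-line filter is a safer starting point.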


So now we can remove the unnecessary check from your code:

>>> result = {}
>>> for line in data.split():
...     result[int(line.split(":")[0])] = line.split(":")[1]
...
>>> result


Here you calculate line.split(":") twice. Do it once and unpack the result:

>>> result = {}
>>> for line in data.split():
...     key, value = line.split(":")
...     result[int(key)] = value
...
>>> result


This is the most basic version. Don't put it back into a dict comprehension, as it would still look quite complex. But you could make a function out of it. For example, something like this:

>>> def to_key_value(line, sep=':'):
...     key, value = line.split(sep)
...     return int(key), value
...
>>> dict(map(to_key_value, data.split()))
{41: 'n',
 43: 'n',
 ...
 58: 'n'}
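If a value may itself contain the separator, the two-name unpacking raises ValueError. A hedged variant, assuming only the first colon separates key from value; the name to_key_value_once and the sample line with the extra colon are made up for illustration:

```python
def to_key_value_once(line, sep=":"):
    """Split on the first separator only; the rest of the line is the value."""
    key, value = line.split(sep, 1)   # maxsplit=1 yields at most two parts
    return int(key), value


# Hypothetical sample; the last line has a colon inside its value.
data = """
41:n
55:cm
59:a:b"""

result = dict(map(to_key_value_once, data.split()))
print(result)   # {41: 'n', 55: 'cm', 59: 'a:b'}
```

A line with no separator at all still raises ValueError, which is arguably the right outcome for malformed input.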


Another option that I came up with:

>>> from functools import partial
>>> lines = data.split()
>>> split_by_colon = partial(str.split, sep=':')
>>> key_value_pairs = map(split_by_colon, lines)
>>> {int(key): value for key, value in key_value_pairs}
{41: 'n',
 43: 'n',
 ...
 58: 'n'}

Also, if you don't want to keep the whole list returned by data.split in memory, you might find this helpful: "Is there a generator version of string.split() in Python?"
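One way to get that lazy behaviour is re.finditer, which yields matches one at a time. This is only a sketch of the idea, not necessarily the approach taken in the linked question; iter_pairs is a made-up name, and the pattern \S+ is chosen to mimic how split() with no arguments cuts the text.

```python
import re


def iter_pairs(text, sep=":"):
    """Lazily yield (int key, str value) for each whitespace-delimited token."""
    # re.finditer produces matches one by one, so no list of all lines is built.
    for match in re.finditer(r"\S+", text):
        key, value = match.group().split(sep)
        yield int(key), value


# Shortened, hypothetical sample in the same shape as the question's data.
data = """
41:n
55:cm
58:n"""

print(dict(iter_pairs(data)))   # {41: 'n', 55: 'cm', 58: 'n'}
```

The final dict still holds every pair, so the saving is only the intermediate list of lines.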







answered 55 mins ago by Georgy












• I said I want solution for list/dict comprehension. Your solution is nice but looks ugly. Thanks. – Rahul Patel, 42 mins ago













          2












          $begingroup$

          There's nothing wrong with the solution you have come with, but if you want an alternative, regex might come in handy here:



          In [10]: import re
          In [11]: data = """
          ...: 41:n
          ...: 43:n
          ...: 44:n
          ...: 46:n
          ...: 47:n
          ...: 49:n
          ...: 50:n
          ...: 51:n
          ...: 52:n
          ...: 53:n
          ...: 54:n
          ...: 55:cm
          ...: 56:n
          ...: 57:n
          ...: 58:n"""

          In [12]: dict(re.findall(r'(d+):(.*)', data))
          Out[12]:
          {'41': 'n',
          '43': 'n',
          '44': 'n',
          '46': 'n',
          '47': 'n',
          '49': 'n',
          '50': 'n',
          '51': 'n',
          '52': 'n',
          '53': 'n',
          '54': 'n',
          '55': 'cm',
          '56': 'n',
          '57': 'n',
          '58': 'n'}


          Explanation:



          1st Capturing Group (d+):



          d+ - matches a digit (equal to [0-9])
          + Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
          : matches the character : literally (case sensitive)



          2nd Capturing Group (.*):



          .* matches any character (except for line terminators)
          * Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)



          If there might be letters in the first matching group (though I doubt it since your casting that to an int), you might want to use:



          dict(re.findall(r'(.*):(.*)', data))


          I usually prefer using split()s over regexes because I feel like I have more control over the functionality of the code.



          You might ask, why would you want to use the more complicated and verbose syntax of regular expressions rather than the more intuitive and simple string methods? Sometimes, the advantage is that regular expressions offer far more flexibility.





          Regarding the comment of @Rahul regarding speed I'd say it depends:



          Although string manipulation will usually be somewhat faster, the actual performance heavily depends on a number of factors, including:




          • How many times you parse the regex

          • How cleverly you write your string code

          • Whether the regex is precompiled


          As the regex gets more complicated, it will take much more effort and complexity to write equivlent string manipulation code that performs well.



          As far as I can tell, string operations will almost always beat regular expressions. But the more complex it gets, the harder it will be that string operations can keep up not only in performance matters but also regarding maintenance.






          share|improve this answer











          $endgroup$













          • $begingroup$
            Yeah. I think regexes are slow too.
            $endgroup$
            – Rahul Patel
            5 hours ago






































          answered 5 hours ago by яүυк (edited 4 hours ago)
























          Note that the comprehension is much easier to read if you break it into one clause per line instead of putting everything on a single line.



          You could also use unpacking to remove some of the calls to line.split:



          >>> dictionary = {
          ...     int(k): v
          ...     for line in data.split('\n')
          ...     # filter before unpacking, so malformed lines are skipped instead of raising
          ...     if len(line.split(':')) == 2
          ...     for k, v in (line.split(':'),)
          ... }
          >>> print(dictionary)
          {41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}
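
          The "tuple thing" in the inner for clause can look odd at first. A small sketch of what it does, using a made-up single line as input: wrapping the split result in a one-element tuple makes the inner loop run exactly once, and that single iteration unpacks the two fields into k and v.

```python
line = "55:cm"  # hypothetical single input line

# (line.split(":"),) is a tuple holding one item, the list ['55', 'cm'].
# Iterating over it yields that list once, and `k, v` unpacks it.
pairs = [(k, v) for k, v in (line.split(":"),)]
print(pairs)  # [('55', 'cm')]

# The same logic written as an explicit loop.
result = []
for k, v in (line.split(":"),):
    result.append((k, v))

assert result == pairs
```

          Because the unpacking happens inside that for clause, any line that does not split into exactly two fields raises a ValueError at that point, so a length check has to run before the clause to have any effect.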


          Or, if the keys can stay as str, you could use dict().



          dict() accepts an iterable of two-item sequences, so it turns each line.split(':') result into a key-value pair for you:



          >>> dictionary2 = dict(
          ...     line.split(':')
          ...     for line in data.split('\n')
          ...     if len(line.split(':')) == 2
          ... )
          >>> print(dictionary2)
          {'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}
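
          Both snippets above still call line.split(':') twice per line, once for the length check and once for the values. One possible way to split each line only once is sketched below; parse_pairs is a hypothetical helper name, and the shortened sample data is only for illustration.

```python
# Hypothetical, shortened sample in the same shape as the question's data.
data = """
41:n
43:n
55:cm
58:n"""


def parse_pairs(text):
    """Parse 'key:value' lines into a dict with int keys, splitting each line once."""
    # Split every line a single time, lazily.
    fields = (line.split(":") for line in text.splitlines())
    # Keep only the lines that produced exactly two fields, then unpack them.
    return {int(key): value for key, value in (f for f in fields if len(f) == 2)}


print(parse_pairs(data))  # {41: 'n', 43: 'n', 55: 'cm', 58: 'n'}
```

          The intermediate generator keeps the filter and the unpacking separate, so blank or malformed lines are dropped before any unpacking is attempted.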


















          • This is great. I was trying this but could not figure out the tuple thing. Thanks. – Rahul Patel, 1 hour ago




























          answered 1 hour ago by Ludisposed































